Abstract. In this paper we propose and study a new complexity model for
approximation algorithms. The main motivation comes from practical problems over large
data sets that need to be solved many times for different scenarios, e.g., many
multicast trees that need to be constructed for different groups of users. In our
model we allow a preprocessing phase, in which some information about the input graph
G = (V, E) is stored in a data structure of limited size. Next, the data structure
enables processing queries of the form "solve problem A for an input S ⊆ V". We
consider problems like STEINER FOREST, FACILITY LOCATION, k-MEDIAN,
k-CENTER and TSP in the case when the graph induces a doubling metric. Our
main results are data structures of near-linear size that are able to answer queries
in time close to linear in |S|. This improves over the typical worst-case running
time of approximation algorithms in the classical setting, which is Ω(|E|)
independently of the query size. In most cases, our approximation guarantees are
arbitrarily close to those in the classical setting. Additionally, we present the first
fully dynamic algorithm for the Steiner tree problem.



1 Introduction 

Motivation The complexity and size of existing communication networks have grown
enormously in recent times. It is now hard to imagine that a group of users willing
to communicate sets up a minimum cost communication network or a multicast tree
according to an approximate solution to the STEINER TREE problem. Instead we are forced
to use heuristics that are computationally more efficient but may deliver suboptimal
results [27, 20]. It is easy to imagine other problems that in principle can be solved with
constant approximation factors using state-of-the-art algorithms, but for which, due to the
immense size of the data, this is impossible in a timely manner. However, in many applications
the network is fixed and we need to solve the problem many times for different groups of users.

Here, we propose a completely new approach that exploits this fact to overcome
the obstacles stemming from huge data sizes. It is able to efficiently deliver results
with good approximation guarantees thanks to the following two assumptions. We
assume that the network can be preprocessed beforehand and that the group of users that
communicates is substantially smaller than the size of the network. The preprocessing
step is independent of the group of users and hence afterwards we can, for example,
efficiently compute a Steiner tree for any set of users.

More formally, in the STEINER TREE problem the algorithm is given a weighted
graph G = (V, E) on n vertices and is allowed some preprocessing. The results of the
preprocessing step need to be stored in limited memory. Afterwards, the set S ⊆ V of
terminals is defined and the algorithm should generate as fast as possible a Steiner tree
for S, i.e., a tree in G of low weight which contains all vertices in S. Given the query
set S of k vertices we should compute the Steiner tree T in time depending only (or
mostly) on k.

The trivial approach to this problem is to compute the metric closure G* of G and
then answer each query by solving the STEINER TREE problem on G* [5]. This
approach delivers results with constant approximation ratio, but requires O(n^2) space for
the data structure and O(k^2) query time. Hence it is far from being practical. In this
work we aim at solutions that substantially improve both of these bounds; more
formally, the data structure space should be close to O(n), while the query time should be
close to O(k). Since in a typical situation k is probably O(log n), even an O(k log n)
query time is not considered fast enough, as then k log n = Θ(k^2). Note that the O(n)
bound on the structure size is very restrictive: in a way, this bound is sublinear in the
sense that we are allowed neither to store the whole distance matrix, nor (if G is dense)
all the edges of G. This models a situation in which during the preprocessing one can use
vast resources (e.g., a huge cluster of servers), but the resources are not granted forever
and when the system processes the queries the available space is much smaller.
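
The trivial baseline described above can be made concrete with a small sketch. This is only an illustration under our own assumptions, not the authors' code: the graph is given as a dictionary of weighted edges, the metric closure is computed with Floyd-Warshall (quadratic space), and a query is answered by Prim's algorithm on the closure restricted to the k terminals (O(k^2) time), using the classical fact that a minimum spanning tree of the terminals in the metric closure is a 2-approximate Steiner tree. The names metric_closure and steiner_query are hypothetical.

```python
import math

def metric_closure(n, edges):
    """Preprocessing: all-pairs shortest paths (Floyd-Warshall), quadratic space."""
    d = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0.0
    for (u, v), w in edges.items():
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for m in range(n):
        for u in range(n):
            for v in range(n):
                if d[u][m] + d[m][v] < d[u][v]:
                    d[u][v] = d[u][m] + d[m][v]
    return d

def steiner_query(d, terminals):
    """Query: Prim's MST on the closure restricted to the terminals, O(k^2) time.

    Returns (weight, closure edges); each closure edge stands for a shortest path in G.
    """
    terminals = list(terminals)
    if not terminals:
        return 0.0, []
    # best[t] = (cheapest known connection of t to the tree, tree endpoint)
    best = {t: (d[terminals[0]][t], terminals[0]) for t in terminals[1:]}
    total, tree = 0.0, []
    while best:
        t = min(best, key=lambda x: best[x][0])
        w, parent = best.pop(t)
        total += w
        tree.append((parent, t))
        for s in best:
            if d[t][s] < best[s][0]:
                best[s] = (d[t][s], t)
    return total, tree

# Star with centre 0 and three leaves at distance 1; the leaves are the terminals.
edges = {(0, 1): 1.0, (0, 2): 1.0, (0, 3): 1.0}
d = metric_closure(4, edges)
weight, tree = steiner_query(d, [1, 2, 3])
```

On this star the optimum Steiner tree has weight 3 (it uses the centre), while the terminal MST in the closure has weight 4, which is within the factor-2 guarantee.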

New Model In our model, computations are divided into two stages: the preprocessing
stage and the query stage. In the preprocessing stage, the input is a weighted graph
G = (V, E) and we should compute our data structure in polynomial time and space.
Apart from the graph G some additional, problem-specific information may also be
provided. In the query stage the algorithm is given the data structure computed in the
preprocessing stage, but not G itself, and a set S of points of V (the query, which may
also be a set of pairs of points from V, or a weighted set of points from V, etc.) and
computes a solution for the set S. The definition of "the solution for the set S" depends on
the specific problem. In this work we consider so-called metric problems, so G
corresponds to a metric space (V, d) where d can be represented as the full distance matrix
M. One should keep in mind that the function d cannot be quickly computed (e.g. in
constant time) without the Ω(n^2) size matrix M. In particular, we assume that there is
no distance oracle available in the query stage.

Hence, there are three key parameters of an algorithm within our model: the size of 
the data structure, the query time and the approximation ratio. Less important, but not 
irrelevant is the preprocessing time. Let us note that though our model is inspired by 
large datasets, in this work we ignore streaming effects, external memory issues etc. 

Above we have formulated the STEINER TREE problem in our model; now we
describe the remaining problems. In the STEINER FOREST problem the algorithm is allowed
to preprocess a weighted graph G = (V, E), whereas the query is composed of a set
S ⊆ V × V of pairs. The algorithm should generate a Steiner forest for S, i.e., a
subgraph H of G of small weight such that each pair in S is connected in H. In the FACILITY
LOCATION problem the algorithm is given in the preprocessing phase a weighted graph
with facility opening costs in the nodes. We consider two variants of this problem in our
model. In the variant with unrestricted facilities, the query is a set S ⊆ V of clients for
which we should open facilities. The goal is to open a subset F ⊆ V of facilities, and
connect each client (city) to an open facility so that the sum of the total opening and
connection costs is minimized. In the other variant, the one with restricted facilities, the
facilities that can be opened are given as a part of the query (together with their opening costs).

Our Results In this paper we restrict our attention to doubling metric spaces, which
include growth-restricted metric spaces and constant dimensional Euclidean spaces. In
other words we assume that the graph G induces a doubling metric and the algorithms
are given the distance matrix G* as an input or compute it at the beginning of the
preprocessing phase. This restriction is often assumed in the routing setting [12, 17] and hence it
is a natural question to see how it can impact the multicast problems. Using this
assumption we show that solutions with nearly optimal bounds are possible. The main result of
the paper is a data structure that requires O(n log n) memory and can find a constant
ratio approximate Steiner tree over a given set of size k in O(k(log k + log log n)) time.
Moreover, we show data structures with essentially the same complexities for solving
STEINER FOREST, both versions of FACILITY LOCATION, k-MEDIAN and TSP. The
query bound is optimal, up to log k and log log n factors, as no algorithm can answer
queries in time less than linear in k, since it needs to read the input. For the exact
approximation ratios of our algorithms refer to Sections 3.2 and E.

All of these results are based on a new hierarchical data structure for representing a
doubling metric that approximates original distances within a (1 + ε) multiplicative factor.
The concept of a hierarchical data structure for representing a doubling metric is not
novel: it originates from the work of Clarkson [8] and was then used in a number of
papers; in particular, our data structure is based on the one due to Jia et al. [16]. Our
main technical contribution here is adapting and extending this data structure so that for
any subset S ⊆ V a substructure corresponding to S can be retrieved in O(k(log k +
log log n)) time using only the information in the data structure, without a distance oracle.
The substructure is then transformed into a pseudo-spanner (defined in Section 3). Note that our
complexity bounds do not depend on the stretch of the metrics, unlike in many previous
works (e.g. [17]). Another original concept in our work is an application of spanners (or,
more precisely, pseudo-spanners) to improve the running time of approximation algorithms
for metric problems. As a result, the query times for the metric problems we consider
are O(k(polylog k + log log n)).

Astonishingly, our hierarchical data structure can be used to obtain dynamic
algorithms for the Steiner tree problem. This problem has attracted considerable
attention [3, 5, 11, 4] in recent years. However, due to the hardness of the problem none of
these papers has given any improvement in the running time over the static algorithms.
Here, we give the first fully dynamic algorithm for the problem in the case of doubling
metrics. Our algorithm is given a static graph and then maintains information about
the Steiner tree built on a given set X of nodes. It supports insertion of vertices in
O(log^5 k + log log n) time, and deletion in O(log^5 k) time, where k = |X|.

Related Work The problems considered in this paper are related to several algorithmic
topics studied extensively in recent years. Many researchers tried to answer the question
whether problems in huge networks can be solved more efficiently than by processing
the whole input. Nevertheless, the model proposed in this paper has never been
considered before. Moreover, we believe that within the proposed framework it is possible to
achieve complexities that are close to being practical. We present such results only in
the case of doubling metrics, but hope that further study will extend these results to
a more general setting. Our results are related to the following concepts:

- Universal Algorithms: this model does not allow any processing at query
time; we allow it and get much better approximation ratios,

- Spanners and Approximate Distance Oracles: although a spanner of a subspace of
a doubling metric can be constructed in O(k log k) time, the construction algorithm
requires a distance oracle (i.e. the full Θ(n^2)-size distance matrix),

- Sublinear Approximation Algorithms: here we cannot preprocess the data;
allowing it, we can get much better approximation ratios,

- Dynamic Spanning Trees: most existing results are only applicable to dynamic
MST and not dynamic Steiner tree, and the ones concerning the latter work in
different models than ours.

Due to space limitations of this extended abstract, an extensive discussion of the related
work is attached in Appendix A and will be included in the full version of the paper.

2 Space partition tree 

In this section we extend the techniques developed by Jia et al. [16]. Several statements
as well as the overall construction are similar to those given by Jia et al. However,
our approach is tuned to better suit our needs, in particular to allow for a fast subtree
extraction and a spanner construction, techniques introduced in Sections 2 and 3 that
are crucial for efficient approximation algorithms.

Let (V, d) be a finite doubling metric space with |V| = n and a doubling constant
λ, i.e., for every r > 0, every ball of radius 2r can be covered with at most λ balls of
radius r. By stretch we denote the stretch of the metric d, that is, the largest distance
in V divided by the smallest distance. We use space partition schemes for doubling
metrics to create a partition tree. In the next two subsections, we show that this tree can
be stored in O(n log n) space, and that a subtree induced by any subset S ⊆ V can be
extracted efficiently.

Let us first briefly introduce the notion of a space partition tree, which is used in the
remainder of this paper. Precise definitions and proofs (in particular a proof of existence
of such a partition tree) can be found in Appendix B.

The basic idea is to construct a sequence 𝕊_0, 𝕊_1, ..., 𝕊_M of partitions of V. We
require that 𝕊_0 = {{v} : v ∈ V} and 𝕊_M = {V}, and in general the diameters of the
sets in 𝕊_k are growing exponentially in k. We also maintain the neighbourhood structure
for each 𝕊_k, i.e., we know which sets in 𝕊_k are close to each other (this is explained
in more detail later on). Notice that the partitions together with the neighbourhood
structure are enough to approximate the distance between any two points x, y: one
only needs to find the smallest k such that the sets in 𝕊_k containing x and y are close
to each other (or are the same set).



There are two natural parameters in this sort of scheme. One of them is how fast
the diameters of the sets grow; this is controlled by τ ∈ ℝ, τ > 1 in our constructions.
The faster the set diameters grow, the smaller the number of partitions is. The second
parameter is how distant the sets in a partition can be to be still considered neighbours;
this is controlled by a nonnegative integer η in our constructions. The smaller this
parameter is, the smaller the number of neighbours is. Manipulating these parameters
allows us to decrease the space required to store the partitions, and consequently also
the running time of our algorithms. However, this also comes at the price of a lower quality
approximation.

In what follows, each 𝕊_k is a subpartition of 𝕊_{k+1} for k = 0, ..., M-1. That is, the
elements of these partitions form a tree, denoted by 𝕋, with 𝕊_0 being the set of leaves
and 𝕊_M being the root. We say that S ∈ 𝕊_j is a child of S* ∈ 𝕊_{j+1} in 𝕋 if S ⊆ S*.

Let r_0 be smaller than the minimal distance between points in V and let r_j = τ^j r_0.
We show (in Appendix B) that 𝕊_k's and 𝕋 satisfying the following properties can be
constructed in polynomial time:

(1) Exponential growth: Every S ∈ 𝕊_j is contained in a ball of radius r_j τ 2^{-η} / (τ - 1).

(2) Small neighbourhoods: For every S ∈ 𝕊_j, the union ∪{B_{r_j}(v) : v ∈ S} crosses
at most λ^{3+η} sets S' from the partition 𝕊_j; we say that S knows these S'. We also
extend this notation and say that if S knows S', then every v ∈ S knows S'.

(3) Small degrees: For every S* ∈ 𝕊_{j+1} all children of S* know each other and,
consequently, there are at most λ^{η+3} children of S*.

(4) Distance approximation: If v, v* ∈ V are different points such that v ∈ S_1 ∈ 𝕊_j,
v ∈ S_2 ∈ 𝕊_{j+1} and v* ∈ S_1* ∈ 𝕊_j, v* ∈ S_2* ∈ 𝕊_{j+1}, and S_2 knows S_2* but S_1 does
not know S_1*, then

r_j ≤ d(v, v*) < (1 + 4τ 2^{-η} / (τ - 1)) τ r_j.

For any ε > 0, the τ and η constants can be adjusted so that the last condition
becomes r_j ≤ d(v, v*) ≤ (1 + ε) r_j (see Remark 32).
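
The shape of the distance bound in Property (4) can be traced back to Properties (1) and (2). The following is only a sketch under our reading of those properties, not the proof from the appendix: we assume that "S_2 knows S_2*" yields points u ∈ S_2 and w ∈ S_2* with d(u, w) ≤ r_{j+1}, and we introduce the symbol ρ_{j+1} for the radius bound of Property (1) at level j+1. The exact constants and the strictness of the inequalities should be checked against Appendix B.

```latex
\begin{align*}
\rho_{j+1} &= \frac{r_{j+1}\,\tau\,2^{-\eta}}{\tau-1}, \qquad r_{j+1} = \tau\, r_j,\\
d(v,v^*) &\le d(v,u) + d(u,w) + d(w,v^*)\\
         &\le 2\rho_{j+1} + r_{j+1} + 2\rho_{j+1}
          = \Bigl(1 + \frac{4\tau\,2^{-\eta}}{\tau-1}\Bigr)\,\tau\, r_j,\\
d(v,v^*) &\ge r_j \quad \text{because } S_1 \text{ does not know } S_1^*,
          \text{ so the ball } B_{r_j}(v) \text{ misses } v^*.
\end{align*}
```

Here d(v, u) ≤ 2ρ_{j+1} because both points lie in S_2, which fits in a ball of radius ρ_{j+1}, and similarly for d(w, v*). Taking τ close to 1 and η large makes the upper factor approach 1.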

Remark 1. We note that not all values of τ and η make sense for our construction. We
omit these additional constraints here.



2.1 The compressed tree T and additional information at nodes

Let us now show how to efficiently compute and store the tree 𝕋. Recall that the leaves
of 𝕋 are one-point sets and, while going up in the tree, these sets join into bigger sets.

Note that if S is an inner node of 𝕋 and it has only one child S', then both nodes S
and S' represent the same set. Nodes S and S' can differ only by their sets of
acquaintances, i.e. the sets of nodes known to them. If these sets are equal, there is some sort
of redundancy in 𝕋. To reduce the space usage we store only a compressed version of
the tree 𝕋.

Let us introduce some useful notation. For a node v of 𝕋 let set(v) denote the set
corresponding to v and let level(v) denote the level of v, where leaves are at level
zero. Let S_a, S_b be a pair of sets that know each other at level j_ab and do not know each
other at level j_ab - 1. Then the triple (S_a, S_b, j_ab) is called a meeting of S_a and S_b at
level j_ab.



Definition 2 (Compressed tree). The compressed version of 𝕋, denoted T, is obtained
from 𝕋 by replacing all maximal paths such that all inner nodes have exactly one child
by a single edge. For each node v of T we store level(v) (the lowest level of set(v)
in 𝕋) and a list of all meetings of set(v), sorted by level.
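
The path compression in Definition 2 can be illustrated with a short sketch. It is a simplification under our own assumptions: nodes carry only their point set, level and children (the lists of meetings are left out), and the class Node and the functions compress and count_nodes are hypothetical names. Every chain of single-child nodes is collapsed into its lowest node, so each surviving node keeps the lowest level at which its set appears.

```python
class Node:
    def __init__(self, points, level, children=()):
        self.points = frozenset(points)   # set(v)
        self.level = level                # level(v); leaves are at level 0
        self.children = list(children)

def compress(node):
    """Return a compressed copy of the tree rooted at `node`.

    Chains of nodes with exactly one child all represent the same set;
    we keep only the lowest node of each chain.
    """
    while len(node.children) == 1:
        node = node.children[0]
    return Node(node.points, node.level, [compress(c) for c in node.children])

def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children)

# Three points: a and b merge at level 1, c joins them only at level 3.
a0, b0, c0 = Node({"a"}, 0), Node({"b"}, 0), Node({"c"}, 0)
ab1, c1 = Node({"a", "b"}, 1, [a0, b0]), Node({"c"}, 1, [c0])
ab2, c2 = Node({"a", "b"}, 2, [ab1]), Node({"c"}, 2, [c1])
root = Node({"a", "b", "c"}, 3, [ab2, c2])
compressed = compress(root)
```

In this toy instance the uncompressed tree has 8 nodes, while the compressed one has 5 = 2n - 1 nodes for n = 3 leaves.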

Obviously T has at most 2n - 1 nodes since it has exactly n leaves and each
inner node has at least two children, but we also have to ensure that the total number of
meetings is reasonable.

Note that the sets at nodes of T are pairwise distinct. To simplify the presentation we
will identify nodes and the corresponding sets. Consider a meeting m = (S_a, S_b, j_ab).
Let p_a (resp. p_b) denote the parent of S_a (resp. S_b) in T. We say that S_a is responsible for
the meeting m when level(p_a) ≤ level(p_b) (when level(p_a) = level(p_b), both S_a
and S_b are responsible for the meeting m). Note that if S_a is responsible for a meeting
(S_a, S_b, j_ab), then S_a knows S_b at level level(p_a) - 1. From this and Property 2 of the
partition tree we get the following.

Lemma 3. Each set in T is responsible for at most λ^{3+η} meetings.

Corollary 4. There are ≤ (2n - 1) λ^{3+η} meetings stored in the compressed tree T, i.e.
T takes O(n) space.

Lemma 5. One can augment the tree T with additional information of size O(n λ^{3+η}),
so that for any pair of nodes x, y of T one can decide if x and y know each other, and
if that is the case the level of the meeting is returned. The query takes O(η log λ) time.

Proof. For each node v in T we store all the meetings it is responsible for, using a
dictionary D(v); the searches take O(log(λ^{3+η})) = O(η log λ) time. To process the
query it suffices to check if there is an appropriate meeting in D(x) or in D(y). □
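
The lookup from the proof above can be sketched as follows. This is a minimal illustration with hypothetical names (TreeNode, record_meeting, know_each_other); it uses a Python dict, i.e. a hash table with expected constant-time lookups, in place of the comparison-based dictionary with O(η log λ) searches assumed in the proof.

```python
class TreeNode:
    def __init__(self, name):
        self.name = name
        # D(v): maps the other node to the meeting level, and holds only
        # the meetings this node is responsible for.
        self.meetings = {}

def record_meeting(responsible, other, level):
    """Store a meeting at the node that is responsible for it."""
    responsible.meetings[other] = level

def know_each_other(x, y):
    """Return the level of the meeting of x and y, or None if there is none."""
    if y in x.meetings:
        return x.meetings[y]
    if x in y.meetings:
        return y.meetings[x]
    return None

a, b, c = TreeNode("a"), TreeNode("b"), TreeNode("c")
record_meeting(a, b, 3)   # a is responsible for the meeting (a, b, 3)
```

Because each meeting is stored only at a responsible node, the query has to inspect both D(x) and D(y), exactly as in the proof.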

In order to give a fast subtree extraction algorithm, we need to define the following
operation meet. Let u, v be two given nodes of T. Let v(j) denote the node of 𝕋 on
the path from v to the root at level j, and define u(j) similarly. The value of meet(u, v)
is the lowest level j such that v(j) and u(j) know each other. Such a level always exists,
because in the end all nodes merge into the root and nodes know each other at one level
before they are merged (see Property 3 of the partition tree). A technical proof of the
following lemma is moved to Appendix C due to space limitations.

Lemma 6. The tree T can be augmented so that the meet operation can be performed
in O(η log λ log log n) time. The augmented tree T can be stored in O(λ^{3+η} n log n)
space and computed in polynomial time.

2.2 Fast subtree extraction 

For any subset S ⊆ V we are going to define an S-subtree of T, denoted T(S).
Intuitively, this is the subtree of T induced by the leaves corresponding to S. Additionally
we store all the meetings in T between the nodes corresponding to the nodes of T(S).



More precisely, the set of nodes of T(S) is defined as {A ∩ S : A ∩ S ≠ ∅ and A is
a node of T}. A node Q of T(S) is an ancestor of a node R of T(S) iff R ⊂ Q. This
defines the edges of T(S). Moreover, for two nodes A, B of T such that both A and B
intersect S, if A knows B at level j, we say that A ∩ S knows B ∩ S in T(S) at level j.
A triple (Q, R, j_QR), where j_QR is the minimal level such that Q knows R at level j_QR,
is called a meeting. The level of a node Q of T(S) is the lowest level of a node A of T
such that Q = A ∩ S. Together with each node Q of T(S) we store its level and a list
of all its meetings (Q, R, j_QR). A node Q is responsible for a meeting (Q, R, l) when
level(parent(Q)) ≤ level(parent(R)).

Remark 7. The subtree T(S) is not necessarily equal to any compressed tree for the
metric space (S, d|_{S^2}).

In this subsection we describe how to extract T(S) from T efficiently. The extraction 
runs in two phases. In the first phase we find the nodes and edges of T(S) and in the 
second phase we find the meetings. 

Finding the nodes and edges of T(S) We construct the extracted tree in a bottom-up
fashion. Note that we cannot simply go up the tree from the leaves corresponding to S
because we could visit a lot of nodes of T which are not nodes of T(S). The key
observation is that if A and B are nodes of T such that A ∩ S and B ∩ S are nodes of
T(S) and C is the lowest common ancestor of A and B, then C ∩ S is a node of T(S)
and it has level level(C).

1. Sort the leaves of T corresponding to the elements of S according to their inorder
value in T, i.e., from left to right.

2. For all pairs (A, B) of neighboring nodes in the sorted order, insert into a dictionary
M a key-value pair where the key is the pair (level(lca_T(A, B)), lca_T(A, B))
and the value is the pair (A, B). The dictionary M may contain multiple elements
with the same key.

3. Insert all nodes from S into a second dictionary P, where nodes are sorted according
to their inorder value from the tree T.

4. while M contains more than one element

(a) Let x = (l, C) be the smallest key in M.

(b) Extract from M all key-value pairs with the key x, denote those values as
(A_1, B_1), ..., (A_m, B_m).

(c) Set P = P \ ∪_i {A_i, B_i}.

(d) Create a new node Q, make the nodes erased from P the children of Q. Store l
as the level of Q.

(e) Insert C into P. Set origin(Q) = C.

(f) If C is not the smallest element in P (according to the inorder value) let C_l
be the largest element in P smaller than C and add a key-value pair to M
where the key is equal to (level(lca_T(C_l, C)), lca_T(C_l, C)) and the value is
(C_l, C).

(g) If C is not the largest element in P let C_r be the smallest element in P larger
than C and add a key-value pair to M where the key is given by the pair
(level(lca_T(C, C_r)), lca_T(C, C_r)) and the value is the pair (C, C_r).

Note that in the above procedure, for each node Q of T(S) we compute the
corresponding node in T, namely origin(Q). Observe that origin(Q) is the lowest
common ancestor of the leaves corresponding to elements of Q, and origin(Q) ∩ S = Q.
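
The key observation (the nodes of the extracted tree are the query leaves together with lowest common ancestors of neighbouring leaves) can be tried out with the following sketch. It is not the dictionary-based procedure above: we swap in the standard stack-based "virtual tree" construction, use a naive LCA that climbs parent pointers instead of a constant-time LCA structure, and ignore levels and meetings. The tree is given as a hypothetical parent array, and all function names are our own.

```python
def preprocess(parent):
    """parent[v] is the parent of v; the root has parent -1.
    Returns (depth, tin, tout) computed by an iterative DFS."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for v, p in enumerate(parent):
        if p < 0:
            root = v
        else:
            children[p].append(v)
    depth, tin, tout = [0] * n, [0] * n, [0] * n
    timer = 0
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            tout[v] = timer          # subtree of v has entry times in [tin[v], tout[v])
            continue
        tin[v] = timer
        timer += 1
        stack.append((v, True))
        for c in reversed(children[v]):
            depth[c] = depth[v] + 1
            stack.append((c, False))
    return depth, tin, tout

def lca(parent, depth, u, v):
    """Naive lowest common ancestor by climbing parent pointers."""
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u

def extract_subtree(parent, depth, tin, tout, leaves):
    """Return {node: parent in the extracted tree} for the tree induced by `leaves`."""
    order = sorted(set(leaves), key=lambda v: tin[v])
    nodes = set(order)
    for a, b in zip(order, order[1:]):
        nodes.add(lca(parent, depth, a, b))   # LCAs of neighbouring leaves
    order = sorted(nodes, key=lambda v: tin[v])
    result, stack = {}, []
    for v in order:
        # Pop until the top of the stack is an ancestor of v.
        while stack and not (tin[stack[-1]] <= tin[v] and tout[v] <= tout[stack[-1]]):
            stack.pop()
        result[v] = stack[-1] if stack else None
        stack.append(v)
    return result

# Root 0 with children 1 and 2; 1 -> 3 -> {4, 5}; 2 -> 6. Leaves are 4, 5, 6.
parent = [-1, 0, 0, 1, 3, 3, 2]
depth, tin, tout = preprocess(parent)
small = extract_subtree(parent, depth, tin, tout, [4, 6])
full = extract_subtree(parent, depth, tin, tout, [6, 4, 5])
```

In the example, the single-child nodes 1 and 2 never appear in the extracted tree, and for the query {4, 6} node 3 is skipped as well, since it is not the lowest common ancestor of any two query leaves.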

Lemma 8. The tree T can be augmented so that the above procedure runs in O(k log k)
time and when it ends the only key in M is the root of the extracted tree.

Proof. All dictionary operations can be easily implemented in O(log k) time whereas
the lowest common ancestor can be found in O(1) time after an O(n)-time
preprocessing (see O). This preprocessing requires O(n) space and has to be performed when T
is constructed. Since we perform O(k) such operations, the total complexity of our
algorithm is O(k log k). □

Finding the meetings in T(S) We generate meetings in a top-down fashion. We
consider the nodes of T(S) in groups. Each group corresponds to a single level. Now
assume we consider a group of nodes u_1, ..., u_t at some level l. Let v_1, ..., v_{t'} be the set
of children of all nodes u_i in T(S). For each node v_i, i = 1, ..., t' we are going to find
all the meetings it is responsible for. Any such meeting (v_i, x, j) is of one of two types:

1. parent(x) ∈ {u_1, ..., u_t}, possibly parent(x) = parent(v_i), or

2. parent(x) ∉ {u_1, ..., u_t}, i.e. level(parent(x)) > l.

The meetings of the first kind are generated as follows. Consider the following set
of nodes of T (drawn as grey disks in Figure 1).

L = {x : x is the first node on the path in T from origin(u_i) to origin(v_j),

for some i = 1, ..., t, j = 1, ..., t'}

We mark all the nodes of L. Next, we identify all pairs of nodes of L that know each
other. By Lemma 3 there are at most λ^{3+η} t' = O(t') such pairs and these pairs can be
easily found by scanning, for each x ∈ L, all the meetings x is responsible for and such
that the node x meets is in L. In this way we identify all pairs of children (v_i, v_j) such
that v_i knows v_j, namely if x, y ∈ L and x knows y in T, then x ∩ S knows y ∩ S in T(S).
Then, if v_i knows v_j, the level of their meeting can be found in O(η log λ log log n)
time using operation meet(origin(v_i), origin(v_j)) from Lemma 6. Hence, finding
the meetings of the first type takes O(λ^{3+η} η log λ · t' log log n) time for one group of
nodes, and O(λ^{3+η} η log λ · k log log n) time in total.

Finding the meetings of the second type is easier. Consider any second type meeting
(v_i, w, l). Let u_j be the parent of v_i. Then there is a meeting (u_j, w, level(u_j)) stored
in u_j. Hence it suffices to consider, for each u_j, all its meetings at level level(u_j).
For every such meeting (u_j, w, level(u_j)), and for every child v_i of u_j we can apply
meet(origin(v_i), origin(w)) from Lemma 6 to find the meeting of v_i and w. For
the time complexity, note that by Property 2 of the partition tree, a node u_j meets
at most λ^{3+η} = O(1) nodes at level level(u_j). Since we can store the lists of meetings sorted
by levels, we can extract all those meetings in O(λ^{3+η}) time. For each meeting we
iterate over the children of u_j (Property 3 of the partition tree) and apply Lemma 6. This
results in O(λ^{3+η} η log λ log log n) time per child, hence O(λ^{3+η} η log λ · k log log n)
time in total.

Fig. 1. Extracting meetings. The figure contains a part of the tree T. Nodes corresponding
to the nodes of T(S) are surrounded by dashed circles. The currently processed group
of nodes (u_i, i = 1, ..., k) are filled with black. Nodes from the set L are filled with
gray. The nodes below the gray nodes are the nodes v_j, i.e. the children of nodes u_i
in T(S).

After extracting all the meetings, we sort them by levels in 0(k log k) time. 
We can claim now the following theorem. 

Theorem 9. For a given set S ⊆ V (|S| = k) we can extract the S-subtree of the
compressed tree T in time O(λ^{3+η} η log λ · k (log k + log log n)) = O(k(log k + log log n)).

3 Pseudospanner construction and applications in approximation 

In this section we use the subtree extraction procedure described in the previous section
to construct for any set S ⊆ V a graph that is essentially a small constant stretch
spanner for S. We then use it to give fast approximation algorithms for several problems.

3.1 Pseudospanner construction 

Definition 10. Let G = (V, E_G) be an undirected connected graph with a weight
function w_G : E_G → ℝ_+. A graph H = (V, E_H), E_H ⊆ E_G with a weight function
w_H : E_H → ℝ_+ is an f-pseudospanner for G if for every pair of vertices u, v ∈ V
we have d_G(u, v) ≤ d_H(u, v) ≤ f · d_G(u, v), where d_G and d_H are shortest path
metrics induced by w_G and w_H. The number f in this definition is called the stretch of the
pseudospanner. A pseudospanner for a metric space is simply a pseudospanner for the
complete weighted graph induced by the metric space.
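
To make the two-sided condition of Definition 10 concrete, here is a brute-force checker. It is only a sketch under our own assumptions: graphs are dictionaries mapping a canonical edge (u, v) with u < v to its weight, distances are computed with Floyd-Warshall, and a small tolerance eps absorbs floating-point error. The names all_pairs and is_pseudospanner are hypothetical.

```python
import math

def all_pairs(n, weighted_edges):
    """Floyd-Warshall shortest-path distances of an undirected weighted graph."""
    d = [[0.0 if u == v else math.inf for v in range(n)] for u in range(n)]
    for (u, v), w in weighted_edges.items():
        d[u][v] = d[v][u] = min(d[u][v], w)
    for m in range(n):
        for u in range(n):
            for v in range(n):
                d[u][v] = min(d[u][v], d[u][m] + d[m][v])
    return d

def is_pseudospanner(n, edges_g, edges_h, f, eps=1e-9):
    """Check E_H subset of E_G and d_G(u,v) <= d_H(u,v) <= f * d_G(u,v) for all pairs."""
    if not set(edges_h) <= set(edges_g):
        return False
    dg, dh = all_pairs(n, edges_g), all_pairs(n, edges_h)
    return all(dg[u][v] - eps <= dh[u][v] <= f * dg[u][v] + eps
               for u in range(n) for v in range(n))

# G is a triangle; H keeps two of its edges but carries its own, larger weights.
g = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.5}
h = {(0, 1): 1.1, (1, 2): 1.1}
```

Here H is a 1.5-pseudospanner for G (the worst pair is 0 and 2, with d_H = 2.2 against d_G = 1.5) but not a 1.2-pseudospanner. Note that H uses its own weights, which is exactly the difference from a classical spanner.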



Remark 11. Note the subtle difference between the above definition and the classical
spanner definition. A pseudospanner H is a subgraph of G in terms of vertex sets and
edge sets but it does not inherit the weight function w_G. We cannot construct spanners
in the usual sense without maintaining the entire distance matrix, which would require
prohibitive quadratic space. However, pseudospanners constructed below become
classical spanners when provided the original weight function.

Also note that it immediately follows from the definition of a pseudospanner that
for all uv ∈ E_H we have w_G(u, v) ≤ w_H(u, v).

In the remainder of this section we let (V, d) be a metric space of size n, where
d is doubling with doubling constant λ. We also use T to denote the hierarchical tree
data structure corresponding to (V, d), and η and τ denote the parameters of T. For any
S ⊆ V, we use T(S) to denote the subtree of T corresponding to S, as described in the
previous section. Finally, we define a constant C(η, τ) =




Theorem 12. Given T and a set S ⊆ V, where |S| = k, one can construct a
C(η, τ)-pseudospanner for S in time O(k(log k + log log n)). This spanner has size O(k).

The proof is in the appendix. 

Remark 13. Similarly to Property 4 of the partition tree, we can argue that the above
theorem gives a (1 + ε)-pseudospanner for any ε > 0. Here, τ and η need to be chosen
appropriately as functions of ε.

Remark 14. It is of course possible to store the whole distance matrix of V and
construct a spanner for any given subspace S using standard algorithms. However, this
approach has a prohibitive O(n^2) space complexity.



3.2 Applications in Approximation 

Results of the previous subsection immediately give several interesting approximation 
algorithms. In all the corollaries below we assume the tree T is already constructed. 

Corollary 15 (Steiner Forest). Given a set of points S ⊆ V, |S| = k, together with
a set of requirements R consisting of pairs of elements of S, a Steiner forest with total
edge-length at most 2C(η, τ)·OPT = (2 + ε)·OPT, for any ε > 0, can be constructed in
time O(k(log^2 k + log log n)).

Proof. We use the O(m log^2 n) algorithm of Cole et al. [9] (where m is the number
of edges) on the pseudospanner guaranteed by Theorem 12. This algorithm can give a
guarantee of 2 + ε for an arbitrarily small ε. □

Similarly by using the MST approximation for TSP we get 



Corollary 16 (TSP). Given a set of points S ⊆ V, |S| = k, a Hamiltonian cycle for S
of total length at most 2C(η, τ)·OPT = (2 + ε)·OPT, for any ε > 0, can be constructed in
time O(k(log k + log log n)).
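
The MST approximation for TSP referred to above is the classical tree-doubling argument: a preorder walk of a minimum spanning tree, with repeated vertices skipped, is a tour of length at most twice the optimum in any metric. The sketch below illustrates only this step, under our own assumptions: it runs a dense O(k^2) Prim's algorithm on an explicit distance function instead of computing the MST on the O(k)-edge pseudospanner, and the names mst_tour and tour_length are hypothetical.

```python
def mst_tour(points, dist):
    """2-approximate TSP tour: preorder walk of a minimum spanning tree (Prim)."""
    points = list(points)
    if not points:
        return []
    root = points[0]
    children = {p: [] for p in points}
    # best[p] = (cheapest known connection of p to the tree, tree endpoint)
    best = {p: (dist(root, p), root) for p in points[1:]}
    while best:
        p = min(best, key=lambda x: best[x][0])
        _, parent = best.pop(p)
        children[parent].append(p)
        for q in best:
            w = dist(p, q)
            if w < best[q][0]:
                best[q] = (w, p)
    # Iterative preorder traversal of the tree gives the visiting order.
    tour, stack = [], [root]
    while stack:
        p = stack.pop()
        tour.append(p)
        stack.extend(reversed(children[p]))
    return tour

def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

# Four points on a line with the usual distance.
line_dist = lambda a, b: abs(a - b)
tour = mst_tour([0, 1, 2, 3], line_dist)
```

Replacing the exact metric by the pseudospanner metric costs an additional factor equal to the stretch, which is where the 2C(η, τ) bound in the corollary comes from.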

Currently, the best approximation algorithm for the facility location problem is the
1.52-approximation of Mahdian, Ye and Zhang [18]. A fast implementation using
Thorup's ideas [22] runs in deterministic O(m log m) time, where m = |F| · |C|, and if
the input is given as a weighted graph of n vertices and m edges, in O(n + m) time,
with high probability (i.e. with probability ≥ 1 - 1/n^{ω(1)}). In an earlier work,
Thorup [23] considers also the k-center and k-median problems in the graph model. When
the input is given as a weighted graph of n vertices and m edges, his algorithms run in
O(n + m) time, w.h.p. and have approximation guarantees of 2 for the k-center problem
and 12 + o(1) for the k-median problem. By using this latter algorithm with our fast
spanner extraction we get the following corollary.

Corollary 17 (Facility Location with restricted facilities). Given two sets of points
C ⊆ V (cities) and F ⊆ V (facilities) together with an opening cost f_i for each facility
i ∈ F, for any ε > 0, a (1.52 + ε)-approximate solution to the facility location problem
can be constructed in time O((|C| + |F|)(log^{O(1)}(|C| + |F|) + log log |V|)), w.h.p.

The application of our results to the variant of FACILITY LOCATION with
unrestricted facilities is not so immediate. We were able to obtain the following.

Theorem 18 (Facility Location with unrestricted facilities). Assume that each
point x of the n-point set V is assigned an opening cost f(x). Given a set of k points
C ⊆ V, for any ε > 0, a (3.04 + ε)-approximate solution to the facility location problem
with cities' set C and facilities' set V can be constructed in time O(k log k (log^{O(1)} k +
log log n)), w.h.p.

The above result is described in Appendix E. Our approach there is a reduction to
the variant with restricted facilities. The general, rough idea is the following: during the
preprocessing phase, for every point x ∈ V we compute a small set F(x) of facilities
that seem a good choice for x, and when processing a query for a set of cities C, we
just apply Corollary 17 to cities' set C and facilities' set ∪_{c ∈ C} F(c).

Corollary 19 (k-center and k-median). Given a set of points C ⊆ V and a number
r ∈ ℕ, for any ε > 0, one can construct:

(i) a (2 + ε)-approximate solution to the r-center problem, or

(ii) a (12 + ε)-approximate solution to the r-median problem

in time O(|C|(log |C| + log log |V|)), w.h.p.

4 Dynamic Minimum Spanning Tree and Steiner Tree 

In this section we give one last application of our hierarchical data structure. It has a
different flavour from the other applications presented in this paper since it is not based on
constructing a spanner, but uses the data structure directly. We solve the Dynamic
Minimum Spanning Tree / Steiner Tree (DMST/DST) problem, where we need to maintain
a spanning/Steiner tree of a subspace $X \subseteq V$ throughout a sequence of vertex additions
and removals to/from $X$.

The quality of our algorithm is measured by the total cost of the tree produced
relative to the optimum tree, and by the time required to add/delete vertices. Let $|V| = n$,
$|X| = k$. Our goal is to give an algorithm that maintains a constant factor approximation
of the optimum tree, while updates are polylogarithmic in $k$, and do not depend (or
depend only slightly) on $n$. It is clear that it is enough to find such an algorithm for
DMST. Due to space limitations, in this section we only formulate the results. Precise
proofs are gathered in Appendix F.

Theorem 20. Given the compressed tree $T(V)$, we can maintain an $O(1)$-approximate
Minimum Spanning Tree for a subset $X$ subject to insertions and deletions of vertices.
The insert operation works in $O(\log^5 k + \log\log n)$ time and the delete operation works
in $O(\log^5 k)$ time, $k = |X|$. Both times are expected and amortized.
A Related Work



In the next few paragraphs we review different approaches to this problem, state the 
differences and try to point out the advantage of the results presented here. 

Universal Algorithms In the case of Steiner Tree and TSP, results pointing in the
direction studied here have already been obtained. In the so-called universal
approximation algorithms introduced by Jia et al. [16], for each element of the request we need
to fix a universal solution in advance. More precisely, in the case of the Steiner Tree
problem, for each $v \in V$ we fix a path $\pi_v$, and a solution to $S$ is given as $\{\pi_v : v \in S\}$.
Using universal algorithms we need very small space to remember the precomputed
solution and we are usually able to answer queries efficiently, but the corresponding
approximation ratios are relatively weak, e.g., for Steiner Tree the approximation ratio
is $O(\log^4 n / \log\log n)$. Moreover, there is no direct way of answering queries in $O(k)$
time, and in order to achieve this bound one needs to use techniques similar to those we use
in Section 2.2. In our model we loosen the assumption that the solution itself has to
be precomputed beforehand, but the data output of the preprocessing is of roughly the
same size (up to polylogarithmic factors). Also, we allow the algorithm slightly more
time for answering the queries and, as a result, are able to improve the approximation
ratio substantially, from polylogarithmic to a constant.

Spanners and Distance Oracles The question whether the graph can be approximately
represented using less space than its size was previously captured by the notion of
spanners and approximate distance oracles. Both of these data structures represent the
distances in the graph up to a given multiplicative factor $f$. The difference is that the
spanner needs to be a subgraph of the input graph, hence distances between vertices
have to be computed by ourselves, whereas the distance oracle can be an arbitrary data
structure that can compute the distances when needed. However, both are limited in
size. For general graphs, $(2t - 1)$-spanners (i.e., the approximation factor is $f = 2t - 1$)
are of size $O(n^{1+1/t})$ and can be constructed in randomized linear time as shown by
Baswana and Sen [7]. On the other hand, Thorup and Zwick [24] have shown that
$(2t - 1)$-approximate oracles of size $O(tn^{1+1/t})$ can be constructed in $O(tmn^{1+1/t})$
time, and are able to answer distance queries in $O(t)$ time. It seems that there is no
direct way to obtain, based on these results, an algorithm that could answer our type of
queries faster than $O(k^2)$.

The construction of spanners can be improved in the case of doubling metric. The 
papers Ml 2171 give a construction of (1 + e)-spanners that have linear size in the case 
when $\varepsilon$ and the doubling dimension of the metric are constant. Moreover, Har-Peled
and Mendel [12] give an $O(n \log n)$ time construction of such spanners. A hierarchical
structure similar to that of [17] and the one we use in this paper was also used by
Roditty [19] to maintain a dynamic spanner of a doubling metric, with an $O(\log n)$
update time. However, all these approaches assume the existence of a distance oracle.
When storing the whole distance matrix, these results, combined with known
approximation algorithms in the classical setting [18, 22, 23, 9], imply a data structure that can
answer Steiner Tree, Facility Location with restricted facilities and $k$-Median
queries in $O(k \log k)$ time. However, it does not seem to be easy to use this approach
to solve the variant of Facility Location with unrestricted facilities. To sum up,
spanners seem to be a good solution in our model in the case when $O(n^2)$ space is
available for the data structure. The key advantage of our solution is the low space
requirement. On the other hand, storing only the spanner requires nearly linear space, but then
we need $O(n)$ time to answer each query: the distance matrix is unavailable, so we
need to process the whole spanner to respond to a query on a given set of vertices.

Sublinear Approximation Algorithms Another way of looking at the problem is the
attempt to devise sublinear algorithms that would be able to solve approximation problems
for a given metric. This study was started by Indyk [13], who gave constant approximation
ratio $O(n)$-time algorithms for: Furthest Pair, $k$-Median (for constant $k$),
Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment,
Maximum Traveling Salesman Problem, Maximum Spanning Tree and
Average Distance. Later on, Bădoiu et al. [6] gave an $O(n \log n)$ time algorithm for
computing the cost of the uniform-cost metric Facility Location problem. These
algorithms run in time much smaller than the $O(n^2)$ size of the metric description. However, the paper
contains many negative conclusions as well. The authors show that for the following
problems $O(n)$-time constant approximation algorithms do not exist: general metric
Facility Location, Minimum-Cost Matching and $k$-Median for $k = n/2$. In
contrast, our results show that if we allow the algorithm to preprocess partial, usually
fixed, data, we can answer queries in sublinear time afterwards.

Dynamic Spanning Trees The study of online and dynamic Steiner tree was started in
the paper [14]. However, the model considered there did not take the computation
time into account, but only minimized the number of edges changed in the Steiner
tree. More recently the Steiner tree problem was studied in a setting more related to
ours 0315 1411 II . The first three of these paper study the approximation ratio possible 
to achieve when the algorithm is given an optimal solution together with the change
of the data. The efficiency issue is only raised in [11], but in the worst case the presented
algorithm can take as much time as computing the solution from scratch. The problem
most related to our results is the dynamic minimum spanning tree (MST) problem. The
study of this problem was finished by showing a deterministic algorithm supporting edge
updates in polylogarithmic time in ifPJl . The dynamic Steiner tree problem is a direct 
generalization of the dynamic MST problem, and we were able to show similar time
bounds. However, there are important differences between the two problems that one
needs to keep in mind. In the case of MST, by definition, the set of terminals remains
unchanged, whereas in the dynamic Steiner tree we can change it. On the other hand,
we cannot hope to get polylogarithmic update times if we allow the edge weights
to change, because this would require maintaining dynamic distances in the graph. The
dynamic distance problem seems to require polynomial time for updates [10].

B Partition tree — precise definitions and proofs 

To start with, let us recall partition and partition scheme definitions. 



Definition 21 (Jia et al. [16], Definition 1). An $(r, \sigma, I)$-partition is a partition of $V$ into
disjoint subsets $S_i$ such that $\mathrm{diam}\, S_i \le r\sigma$ for all $i$ and for all $v \in V$, the ball $B_r(v)$
intersects at most $I$ sets in the partition.

A $(\sigma, I)$ partition scheme is an algorithm that produces an $(r, \sigma, I)$-partition for
arbitrary $r \in \mathbb{R}$, $r > 0$.

Lemma 22 (similar to Jia et al. [16], Lemma 2). Let $\eta \ge 0$ be a nonnegative integer.
For $V$ being a doubling metric space with doubling constant $\lambda$, there exists a $(2^{-\eta}, \lambda^{3+\eta})$
partition scheme that works in polynomial time. Moreover, for every $r$ the generated
partition $\mathbb{S}_r$ has the following property: for every $S \in \mathbb{S}_r$ there exists $\mathrm{leader}(S) \in S$
such that $S \subseteq B_{2^{-\eta-1}r}(\mathrm{leader}(S))$.

Proof. Take arbitrary $r$. Start with $V_0 = V$. At step $i$ for $i = 0, 1, \ldots$ take any $v_i \in V_i$
and take $S_i = B_{2^{-\eta-1}r}(v_i) \cap V_i$. Set $V_{i+1} = V_i \setminus S_i$ and proceed to the next step. Obviously,
$S_i \subseteq B_{2^{-\eta-1}r}(v_i)$, so $\mathrm{diam}\, S_i \le 2^{-\eta}r$ and we set $\mathrm{leader}(S_i) = v_i$.

Take any $v \in V$ and consider all sets $S_i$ crossed by the ball $B_r(v)$. Every such set is
contained in $B_{(1+2^{-\eta})r}(v) \subseteq B_{2r}(v)$, which can be covered by at most $\lambda^{3+\eta}$ balls of
radius $2^{-\eta-2}r$. But for every $i \ne j$, $d(v_i, v_j) > 2^{-\eta-1}r$, so every leader of a set crossed
by $B_r(v)$ must be in a different ball. Therefore there are at most $\lambda^{3+\eta}$ sets crossed. □
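The greedy procedure in the proof above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `ball_partition`, the list-based representation, and the use of closed balls are not taken from the paper, and the quadratic scan is chosen for readability rather than speed.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def ball_partition(
    points: Sequence[Hashable],
    dist: Callable[[Hashable, Hashable], float],
    r: float,
    eta: int,
) -> List[Tuple[Hashable, List[Hashable]]]:
    """Greedy partition in the spirit of the proof of Lemma 22.

    Repeatedly pick any remaining point as a leader and carve out the
    (closed) ball of radius 2^(-eta-1) * r around it.
    Returns a list of (leader, members) pairs.
    """
    radius = r * 2.0 ** (-eta - 1)
    remaining = list(points)
    parts: List[Tuple[Hashable, List[Hashable]]] = []
    while remaining:
        leader = remaining[0]  # "take any v_i in V_i"
        members = [p for p in remaining if dist(leader, p) <= radius]
        remaining = [p for p in remaining if dist(leader, p) > radius]
        parts.append((leader, members))
    return parts
```

Every part has diameter at most $2^{-\eta}r$, and any two leaders are at distance greater than $2^{-\eta-1}r$, which are the two properties the counting argument in the proof relies on.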

Let us define the space partition tree T. 

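Algorithm 23 Assume we have a doubling metric space $(V, d)$ and the $(2^{-\eta}, \lambda^{3+\eta})$ partition
scheme from Lemma 22. Let us assume $\eta \ge 2$ and let $\tau$ be a real constant satisfying:

- $\frac{\tau 2^{-\eta}}{\tau - 1} \le \frac{1}{2}$, i.e., $\tau \ge \frac{1}{2^{\eta-1} - 1} + 1$.

- $\tau \le 2^{\eta}$.

Then construct the space partition tree $T$ as follows:

1. Start with partition $\mathbb{S}_0 = \{\{v\} : v \in V\}$, and $r_0 < \min\{d(u, v) : u, v \in V, u \ne v\}$.
For every $\{v\} \in \mathbb{S}_0$ let $\mathrm{leader}(\{v\}) = v$. Let $\mathbb{S}'_0 = \mathbb{S}_0$.

2. Let $j := 0$.

3. While $\mathbb{S}_j$ has more than one element do:

(a) Fix $r_{j+1} := \tau r_j = \tau^{j+1} r_0$.

(b) Let $\mathbb{S}'_{j+1}$ be a partition of the set $L_j = \{\mathrm{leader}(S) : S \in \mathbb{S}_j\}$ generated by
the given partition scheme for $r = 2r_{j+1}$.

(c) Let $\mathbb{S}_{j+1} := \{\bigcup\{S : \mathrm{leader}(S) \in S'\} : S' \in \mathbb{S}'_{j+1}\}$.

(d) Set $\mathrm{leader}(\bigcup\{S : \mathrm{leader}(S) \in S'\}) = \mathrm{leader}(S')$ for any $S' \in \mathbb{S}'_{j+1}$.

(e) $j := j + 1$.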
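As an illustration of the loop above, here is a minimal Python sketch that builds the levels $\mathbb{S}_0, \mathbb{S}_1, \ldots$ one by one. The function name `build_partition_tree`, the choice $r_0 = \min d / 2$, the dictionary representation (leader mapped to members) and the inlined greedy ball carving are assumptions made for this example; it is a naive quadratic-time version, not the compressed tree used for the running-time bounds.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Level = Dict[Hashable, List[Hashable]]  # leader -> members of its set


def build_partition_tree(
    points: Sequence[Hashable],
    dist: Callable[[Hashable, Hashable], float],
    eta: int = 2,
    tau: float = 2.0,
) -> List[Tuple[float, Level]]:
    """Return [(r_0, S_0), (r_1, S_1), ...] following the steps of Algorithm 23."""
    if len(points) < 2:
        raise ValueError("need at least two distinct points")
    if tau <= 1.0 or eta < 2 or tau * 2.0 ** (-eta) / (tau - 1.0) > 0.5 or tau > 2.0 ** eta:
        raise ValueError("eta and tau violate the conditions of Algorithm 23")

    min_d = min(dist(u, v) for i, u in enumerate(points) for v in points[i + 1:])
    r = min_d / 2.0  # any r_0 strictly below the minimum distance works
    level: Level = {v: [v] for v in points}  # S_0: singletons, each its own leader
    levels = [(r, level)]

    while len(level) > 1:
        r *= tau  # (a) r_{j+1} = tau * r_j
        # (b) partition the leaders with the scheme of Lemma 22 run for 2 * r_{j+1}
        radius = 2.0 * r * 2.0 ** (-eta - 1)
        leaders = list(level)
        merged: Level = {}
        while leaders:
            head = leaders[0]
            group = [u for u in leaders if dist(head, u) <= radius]
            leaders = [u for u in leaders if dist(head, u) > radius]
            # (c), (d): merge the sets whose leaders fell together; head leads the union
            merged[head] = [p for u in group for p in level[u]]
        level = merged
        levels.append((r, level))  # (e) next level
    return levels
```

Many consecutive levels produced this way can be identical, which is the redundancy that a compressed representation of the tree avoids.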

Note that for every $j$, $\mathbb{S}_j$ is a partition of $V$. We will denote by $\mathrm{leader}_j(v)$ the
leader of the set $S \in \mathbb{S}_j$ such that $v \in S$.

Definition 24. We will say that $S^* \in \mathbb{S}_{j+1}$ is a parent of $S \in \mathbb{S}_j$ if $\mathrm{leader}(S) \in S^*$
(equivalently, $S \subseteq S^*$). This allows us to consider the sets $\mathbb{S}_j$ generated by Algorithm 23 as
nodes of a tree $T$ with the root being the set $V$.



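Lemma 25. For every $j$ and for every $v \in V$ the following holds:

$$d(v, \mathrm{leader}_j(v)) \le \frac{\tau 2^{-\eta}}{\tau - 1}\, r_j.$$

Proof. Note that

$$d(v, \mathrm{leader}_j(v)) \le \sum_{i=1}^{j} d(\mathrm{leader}_i(v), \mathrm{leader}_{i-1}(v)).$$

We use the bound from Lemma 22:

$$\sum_{i=1}^{j} d(\mathrm{leader}_i(v), \mathrm{leader}_{i-1}(v)) \le \sum_{i=1}^{j} 2^{-\eta-1} \cdot 2 r_i = 2^{-\eta} r_0\, \frac{\tau^{j+1} - \tau}{\tau - 1} < \frac{\tau 2^{-\eta}}{\tau - 1}\, r_j.$$

□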



Lemma 26. For every $j$, for every $S \in \mathbb{S}_j$, the union of balls $\bigcup\{B_{r_j}(v) : v \in S\}$
crosses at most $\lambda^{3+\eta}$ sets from the partition $\mathbb{S}_j$.

Proof. For $j = 0$ this is obvious, since $r_0$ is smaller than any $d(u, v)$ for $u \ne v$. Let us
assume $j > 0$.

Let $v \in S \in \mathbb{S}_j$, $v^* \in S^* \in \mathbb{S}_j$, $S \ne S^*$ and $d(v, v^*) \le r_j$. Then, using Lemma 25,

$$d(\mathrm{leader}_j(v), \mathrm{leader}_j(v^*)) \le d(\mathrm{leader}_j(v), v) + d(v, v^*) + d(v^*, \mathrm{leader}_j(v^*)) \le r_j\Bigl(1 + 2\,\frac{\tau 2^{-\eta}}{\tau - 1}\Bigr) \le 2r_j.$$

Since, by partition properties, $B_{2r_j}(\mathrm{leader}_j(v))$ crosses at most $\lambda^{3+\eta}$ sets from $\mathbb{S}'_j$ and
$\mathrm{leader}_j(v^*) \in B_{2r_j}(\mathrm{leader}_j(v))$, this finishes the proof. □

Definition 27. We say that a set $S \in \mathbb{S}_j$ knows a set $S' \in \mathbb{S}_j$ if
$\bigcup\{B_{r_j}(v) : v \in S\} \cap S' \ne \emptyset$. We say that $v \in V$ knows $S' \in \mathbb{S}_j$ if $v \in S \in \mathbb{S}_j$ and $S$ knows $S'$ or
$S = S'$.

Note that Lemma 26 implies the following:

Corollary 28. A set (and therefore a node too) at a fixed level $j$ has at most $\lambda^{3+\eta}$
acquaintances.

Lemma 29. Let $S \in \mathbb{S}_j$ be a child of $S^* \in \mathbb{S}_{j+1}$ and let $S$ know $S' \in \mathbb{S}_j$. Then either
$S' \subseteq S^*$ or $S^*$ knows the parent of $S'$.

Proof. Assume that $S'$ is not a child (subset) of $S^*$ and let $S^{**} \in \mathbb{S}_{j+1}$ be the parent
of $S'$. Since $S$ knows $S'$, there exist $v \in S$, $v' \in S'$ satisfying $d(v, v') \le r_j$. But
$r_j < r_{j+1}$ and $v \in S^*$ and $v' \in S^{**}$, so $S^*$ knows $S^{**}$. □

Lemma 30. A set $S^* \in \mathbb{S}_j$ has at most $\lambda^{3+\eta}$ children in the tree $T$.

Proof. By construction of level $j$, let $S \in \mathbb{S}_{j-1}$ be such a set that $\mathrm{leader}(S) =
\mathrm{leader}(S^*)$ (in the construction step we divided the set of leaders $L_{j-1}$ into the partition $\mathbb{S}'_j$).
Let $S' \in \mathbb{S}_{j-1}$ be another child of $S^*$. Then, by construction and the assumption that
$\tau \le 2^{\eta}$:

$$d(\mathrm{leader}(S'), \mathrm{leader}(S)) \le 2r_j \cdot 2^{-\eta-1} = 2^{-\eta}\tau r_{j-1} \le r_{j-1}.$$

However, by Lemma 26, $B_{r_{j-1}}(\mathrm{leader}(S))$ crosses at most $\lambda^{3+\eta}$ sets at level $j - 1$.
That finishes the proof. □

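Lemma 31. Let $v, v^* \in V$ be different points such that $v \in S_1 \in \mathbb{S}_j$, $v \in S_2 \in \mathbb{S}_{j+1}$
and $v^* \in S_1^* \in \mathbb{S}_j$, $v^* \in S_2^* \in \mathbb{S}_{j+1}$ and $S_2$ knows $S_2^*$ but $S_1$ does not know $S_1^*$. Then

$$r_j < d(v, v^*) \le \Bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\Bigr)\tau r_j.$$

For $\tau = 2$ and $\eta = 2$ this implies $r_j < d(v, v^*) \le 6r_j$.

Proof. Since $S_1$ and $S_1^*$ do not know each other, $v$ and $v^*$ are at distance greater than $r_j$.
Since $S_2$ knows $S_2^*$, there exist $u \in S_2$ and $u^* \in S_2^*$ such that $d(u, u^*) \le r_{j+1}$.
Therefore

$$d(v, v^*) \le d(v, \mathrm{leader}(S_2)) + d(\mathrm{leader}(S_2), u) + d(u, u^*) + d(\mathrm{leader}(S_2^*), u^*) + d(\mathrm{leader}(S_2^*), v^*)$$
$$\le 4 \cdot \frac{\tau 2^{-\eta}}{\tau - 1}\, r_{j+1} + r_{j+1} = \Bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\Bigr)\tau r_j.$$

□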

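Remark 32. Imagine we want in Lemma 31 to obtain the bound $r_j < d(v, v^*) \le (1 + \varepsilon)r_j$
for some small $1 > \varepsilon > 0$. Take $\tau = 1 + \frac{\varepsilon}{3}$. We want here the following: $\frac{4\tau 2^{-\eta}}{\tau - 1} \le \frac{\varepsilon}{3}$,
i.e., $2^{-\eta} \le \frac{\varepsilon^2}{12(3 + \varepsilon)}$, for which $2^{-\eta} \le \frac{\varepsilon^2}{48}$ suffices. Then we have

$$d(v, v^*) \le \Bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\Bigr)\tau r_j \le \Bigl(1 + \frac{\varepsilon}{3}\Bigr)^2 r_j \le (1 + \varepsilon)r_j.$$

Note that to obtain this we need $2^{\eta} = O(1/\varepsilon^2)$. Note that the conditions in Algorithm 23
for $\eta$ and $\tau$ are much weaker than we assumed here.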
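The parameter choice in this remark is easy to check numerically. The following Python sketch picks $\tau = 1 + \varepsilon/3$ and the smallest integer $\eta$ with $2^{-\eta} \le \varepsilon^2 / (12(3+\varepsilon))$, then evaluates the factor $(1 + \frac{4\tau 2^{-\eta}}{\tau-1})\tau$ from Lemma 31. The function names are hypothetical and exist only for this example.

```python
import math
from typing import Tuple


def lemma31_factor(tau: float, eta: int) -> float:
    """Upper-bound factor (1 + 4*tau*2^-eta / (tau - 1)) * tau from Lemma 31."""
    return (1.0 + 4.0 * tau * 2.0 ** (-eta) / (tau - 1.0)) * tau


def parameters_for(eps: float) -> Tuple[float, int]:
    """Choose tau and eta as in Remark 32 for a target factor 1 + eps."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    tau = 1.0 + eps / 3.0
    # smallest integer eta with 2^-eta <= eps^2 / (12 * (3 + eps))
    eta = math.ceil(math.log2(12.0 * (3.0 + eps) / eps ** 2))
    return tau, eta
```

For instance, `parameters_for(0.5)` yields $\eta = 8$, and the resulting factor stays below $1.5$; $\eta$ grows like $2\log_2(1/\varepsilon)$, in line with $2^{\eta} = O(1/\varepsilon^2)$.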



C Implementation of the meet and jump operations

In this section we provide realizations of the meet and jump operations that work fast, i.e.,
roughly in $O(\log\log n)$ time.

Let us now recall the semantics of the meet operation, which was used in the fast
subtree extraction in Section 2.2. For nodes $u$ and $v$, by $u(j)$ and $v(j)$ we denote the
ancestor of $u$ (resp. $v$) in the tree at level $j$. The meet$(u, v)$ operation returns the lowest
level $j$ such that $u(j)$ and $v(j)$ know each other. This operation can be performed in
$O(\lambda^{\eta+3} \log\log n)$ time.

Operation jump is used by the dynamic algorithms in Section 4 and its semantics
is as follows. In the compressed tree, for each set $S$ we store a list of all meetings of $S$,
sorted by level. The jump$(v, i)$ operation, given a node $v$ and a level $i$, outputs the set $S$ and a meeting
$(S, S', j)$ such that $v \in S$ and $j$ is the lowest possible level such that $i \le j$. Informally
speaking, it looks for the first meeting of a set containing $v$ such that its level is at least
$i$. The jump operation works in $O(\log\log n + \log\log\log \mathrm{stretch})$ time. If we require that
there is some meeting at level $i$ somewhere in the tree, maybe distant from $v$, the time
reduces to $O((\log\eta + \log\log\lambda)\log\log n)$.

C.l Path partition 

In order to implement the jump and meet operations efficiently we need to store addi- 
tional information concerning the structure of T, namely a path partition. The following 
lemma defines the notion. 

Lemma 33. The set of edges of the tree $T$ can be partitioned into a set of paths $P =
\{P_1, \ldots, P_m\}$ such that each path starts at some node of $T$ and goes down the tree only,
and for each node $v$ of the tree $T$ the path from $v$ to the root contains edges from at most
$\lceil \log_2 n \rceil$ paths of the path decomposition $P$. Moreover, $P$ can be found in $O(n)$ time.

Proof. We use a concept similar to the one used by Sleator and Tarjan in [21]. We start
from the root and each edge incident to the root is a beginning of a new path. We then
proceed to decompose each subtree of the root recursively. When considering a subtree 
rooted at a node v we lengthen the path going down from the parent of v by one edge 
going to the subtree containing the largest number of nodes (breaking ties arbitrarily). 
Each of the remaining edges leaving v starts a new path. 

It is easy to see that each path goes down the tree only. Now consider a node v. 
When we go up from v to the root, every time we reach an end of some path from P, the 
size of the subtree rooted at the node we move into doubles. This ends the proof since 
there are at most 2n — 1 vertices. □ 
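The decomposition described in this proof can be sketched as follows. This is an illustrative Python version, assuming the tree is given as a hypothetical `children` dictionary; the names `path_partition` and `paths_on_root_walk` are not from the paper. Each edge is identified by its lower endpoint, and ties between equally large subtrees are broken by taking the first child.

```python
from typing import Dict, Hashable, List


def path_partition(children: Dict[Hashable, List[Hashable]], root: Hashable) -> Dict[Hashable, int]:
    """Assign a path id to every edge (keyed by the edge's lower endpoint)."""
    order = [root]  # BFS order: parents always precede their children
    idx = 0
    while idx < len(order):
        order.extend(children.get(order[idx], []))
        idx += 1

    size: Dict[Hashable, int] = {}
    for v in reversed(order):  # children are handled before their parent
        size[v] = 1 + sum(size[c] for c in children.get(v, []))

    path_of: Dict[Hashable, int] = {}
    next_id = 0
    for v in order:
        kids = children.get(v, [])
        if not kids:
            continue
        heavy = max(kids, key=lambda c: size[c])  # child with the largest subtree
        for c in kids:
            if v != root and c == heavy:
                path_of[c] = path_of[v]  # lengthen the path coming from parent(v)
            else:
                path_of[c] = next_id  # every other edge starts a new path
                next_id += 1
    return path_of


def paths_on_root_walk(v: Hashable, parent: Dict[Hashable, Hashable], path_of: Dict[Hashable, int]) -> List[int]:
    """List the distinct path ids met while walking from v up to the root."""
    seen: List[int] = []
    while v in parent:
        if not seen or seen[-1] != path_of[v]:
            seen.append(path_of[v])
        v = parent[v]
    return seen
```

On a complete binary tree with 15 nodes, for example, no walk to the root meets more than three paths, and on a simple chain every edge lands in a single path.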

We now describe additional information related to the path decomposition that we
need to store. Each node $v$ of $T$ maintains a set paths$(v)$, where $(i, \mathit{level}) \in$ paths$(v)$ if
the path from $v$ to the root contains at least one edge of the path $P_i$, and the lowest such
edge has its bottom endpoint at level $\mathit{level}$. In other words, $P_i$ enters the path from $v$ to
the root at level $\mathit{level}$. We use two different representations of the set paths$(v)$
simultaneously. One is a dictionary implemented as a hash table, and the other is an array sorted
by level. Because of the properties of the path decomposition $P$ from Lemma 33, for
each node $v$ we have $|\mathrm{paths}(v)| \le \lceil \log_2 n \rceil$.

Let $P_i \in P$ be a path with vertices $\{v_1, \ldots, v_t\}$ (given in order of increasing level).
We define interior$(P_i)$ to be the set $\{v_1, \ldots, v_{t-1}\}$, i.e. we exclude the top vertex of
$P_i$. We also define toplevel$(P_i)$ to be the level of $v_{t-1}$, i.e. the highest level among
interior nodes of $P_i$.



C.2 The meet operation 



In order to benefit from the path decomposition to implement the meet operation, we also
need to store adjacency information for paths, similar to the information we store for
single nodes. Let $P_a, P_b \in P$ be two paths, such that their interior nodes know each
other at level $j_{ab}$, but not at level $j_{ab} - 1$. Then the triple $(P_a, P_b, j_{ab})$ is called a
meeting of $P_a$ and $P_b$ at level $j_{ab}$. We also say that $P_a$ and $P_b$ meet at level $j_{ab}$, or that
they know each other. This definition is just a generalisation of a similar definition for
pairs of nodes of $T$. We may also define a notion of responsibility for paths which is
analogous to the definition for nodes and formulate a lemma analogous to Lemma 5.

Lemma 34. One can augment the tree $T$ with additional information of size $O(n\lambda^{3+\eta})$,
so that for any pair of paths $P_x, P_y \in P$ one can decide if $P_x$ and $P_y$ know each
other, and if that is the case the level of the meeting is returned. The whole query takes
$O(\eta \log \lambda)$ time.

Now, suppose we are given two nodes $u, v \in T$ and we are to compute meet$(u, v)$.
The following lemma provides a crucial insight into how this can be done.

Lemma 35. Let $(i, j) \in$ paths$(u)$, which means that the path $P_i$ reaches the path
from $u$ to the root at level $j$, and assume that nodes $u, v$ start to know each other at
level $j_{uv} =$ meet$(u, v)$, where $j_{uv} \le$ toplevel$(P_i)$. Then either $(i, \ell) \in$ paths$(v)$ for
some $\ell$, or there exists $i'$ such that paths $P_i$ and $P_{i'}$ know each other, $P_i$ is responsible
for their meeting, and $(i', \ell) \in$ paths$(v)$ for some $\ell$. Moreover, this condition can be
checked in $O(\lambda^{\eta+3})$ time.

Proof. Since $j_{uv} \le$ toplevel$(P_i)$ we know that at level toplevel$(P_i)$ the paths from $u$
to the root and from $v$ to the root either merged, or else the nodes on those paths at level
toplevel$(P_i)$ know each other. If those paths merged, then $P_i$ intersects the path from
$v$ to the root, and we know that $(i, *) \in$ paths$(v)$. This can be checked in the hash table
for paths$(v)$ in $O(1)$ time.

Otherwise as $i'$ we take $P_{i'}$ to be the lowest path $P_{i''} \in P$ such that $(i'', \ell) \in$
paths$(v)$ for some $\ell$, and toplevel$(P_{i''}) \ge$ toplevel$(P_i)$. To check if this occurs,
we take $S_i$, the interior node of $P_i$ with the highest level, iterate over all $S'_i$ known
by $S_i$ and look for the path containing $S'_i$ in the hash table for paths$(v)$. As $S_i$ knows at
most $\lambda^{\eta+3}$ sets, the bound follows. □

Now, using Lemma 35 we can do a binary search over the elements of paths$(u)$,
and find a pair $(i_u, j_u) \in$ paths$(u)$ such that meet$(u, v) \le$ toplevel$(P_{i_u})$ and
meet$(u, v) \ge j_u$. Namely, we look for the lowest path in paths$(u)$ that satisfies
Lemma 35. Similarly, we can find $(i_v, j_v) \in$ paths$(v)$. Since paths $P_{i_u}$ and $P_{i_v}$ know
each other, we simply use Lemma 34 to find the exact level $j$ where they meet, and as
the result of meet$(u, v)$ return $\max(j_u, j_v, j)$. We need to take the maximum of those
values, because paths $P_{i_u}$ and $P_{i_v}$ could possibly meet before they enter the paths from
$u$ and $v$ to the root.

Lemma 36 (Lemma 6 restated). The tree $T$ can be augmented so that the meet
operation can be performed in $O(\eta \log \lambda \log\log n)$ time. The augmented tree $T$ can be
stored in $O(\lambda^{3+\eta} n \log n)$ space and computed in polynomial time.

Proof. Since $|\mathrm{paths}(u)| \le \lceil \log_2 n \rceil$ we perform $O(\log\log n)$ steps of the binary
search. During each step we perform $O(\lambda^{\eta+3})$ searches in a hash table, thus we can
find the result of meet$(u, v)$ in $O(\log\log n)$ time.

The space bound follows from Corollary 4 (the additional $\log n$ factor in the space
bound comes from the size of paths$(x)$ for each node $x$). Now we need only to describe
how to obtain a running time independent of the stretch of the metric. In order to compute
the tree $T$ (without augmentation) we can slightly improve our construction algorithm:
instead of going into the next level, one can compute the smallest distance between
current sets and jump directly to the level when some pair of sets merges or begins to
know each other. □

Remark 37. We could avoid storing paths in arrays by maintaining, for each path in P, 
links to paths distant by powers of two in the direction of the root (i.e. at most log log n 
links for each path). 

Also, to obtain a better space bound, we could use a balanced tree instead of the hash
tables to keep the first copy of paths. If we use persistent balanced trees, we can get
an $O(n \log\log n)$ total space bound. However, in that case the search time would be
increased to $O((\log\log n)^2)$ for one call to the meet operation.



C.3 The jump operation 

Lemma 38. The compressed tree $T$ can be enhanced with additional information of
size $O(\lambda^{\eta+3} n \log n)$ in such a way that the jump$(v, i)$ operation can be performed in
$O(\log\log n + \log\log\log \mathrm{stretch})$ time, where stretch denotes the stretch of the
metric. If we require that there is some meeting at level $i$ somewhere in the tree (possibly not
involving $v$), the jump operation can be performed in $O((\log\eta + \log\log\lambda)\log\log n)$
time.

Proof. To calculate jump$(v, i)$, we first look at paths$(v)$ and binary search for the lowest
path $P \in$ paths$(v)$ such that the highest node in $P$ has level greater than $i$. If $P =
\{v_1, \ldots, v_t\}$ (given in order of increasing level), that means that level$(v_1) \le i <$
level$(v_t)$. This step takes $O(\log\log n)$ time.

To finish the jump operation, we need, among the meetings on path $P$, to find the lowest
one with level not smaller than $i$. As levels are numbered from 0 to $\log \mathrm{stretch}$,
this can be done using the y-Fast Tree data structure [25, 26]. The y-Fast Tree uses linear
space and answers predecessor queries in $O(\log\log u)$ time, where $u$ is the size of the
universe, here $u = \log \mathrm{stretch}$.

To erase the dependency on stretch, note that according to Corollary 4 there are
at most $M := (2n - 1)\lambda^{\eta+3}$ meetings in the tree $T$. Therefore, we can assign to every
level $j$ where some meeting occurs a number $0 \le n(j) < M$ such that for two such levels
$j$ and $j'$, $j < j'$ iff $n(j) < n(j')$. The mapping $n(\cdot)$ can be implemented as a hash
table, thus calculating $n(j)$ takes $O(1)$ time. Instead of using y-Fast Trees with level
numbers as the universe, we use the numbers $n(\cdot)$. This requires $O(\log\log n + \log\log M) =
O((\log\eta + \log\log\lambda)\log\log n)$ time, but we need to have the key $i$ in the hash table,
i.e., there needs to be some meeting at level $i$ somewhere in the tree. □
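The rank-compression step of this proof can be illustrated with a small Python sketch. It assumes a hypothetical input format (a dictionary from path identifiers to the levels of their meetings), and it substitutes a plain binary search via `bisect` for the y-Fast Tree, so it shows the renumbering idea only and does not achieve the stated $O(\log\log M)$ query time.

```python
import bisect
from typing import Dict, Hashable, List, Optional


class MeetingIndex:
    """Per-path successor search over rank-compressed meeting levels."""

    def __init__(self, meeting_levels: Dict[Hashable, List[int]]) -> None:
        all_levels = sorted({lvl for lvls in meeting_levels.values() for lvl in lvls})
        self.levels = all_levels
        # n(j): order-preserving renumbering of the levels where some meeting occurs
        self.rank = {lvl: i for i, lvl in enumerate(all_levels)}
        self.ranks_on_path = {
            path: sorted(self.rank[lvl] for lvl in set(lvls))
            for path, lvls in meeting_levels.items()
        }

    def lowest_meeting_at_or_above(self, path: Hashable, i: int) -> Optional[int]:
        """Lowest meeting level j >= i on `path`; i must itself be a meeting level."""
        if i not in self.rank:
            raise KeyError("no meeting at this level anywhere in the tree")
        ranks = self.ranks_on_path[path]
        pos = bisect.bisect_left(ranks, self.rank[i])  # stand-in for the y-Fast Tree query
        return self.levels[ranks[pos]] if pos < len(ranks) else None
```

The `KeyError` branch mirrors the requirement in the lemma that some meeting at level $i$ exists somewhere in the tree, since otherwise $n(i)$ is undefined.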



D Omitted Proofs 



Proof (of Theorem 12).

Recall that nodes of $T(S)$ are simply certain subsets of $S$; in particular all
single-element subsets of $S$ are nodes of $T(S)$. Associate with every node $A$ of $T(S)$ an
element $a$ of $A$, which we will call leader$(A)$, so that:

- if $A = \{a\}$ (which means $A$ is a leaf in $T(S)$), then leader$(A) = a$,

- if $A$ has sons $A_1, \ldots, A_m$ in $T(S)$, then let leader$(A)$ be any of leader$(A_i)$,
$i = 1, \ldots, m$.

If two nodes $A$, $B$ in $T(S)$ know each other, we will also say that their leaders leader$(A)$
and leader$(B)$ know each other. Also, if $A$ is the parent of $B$, and $a \ne b$, where
$a =$ leader$(A)$ and $b =$ leader$(B)$, we will say that $a$ is the parent of $b$. We will also
say that $a$ beats $b$ at level $L$, where $L$ is the level at which $A$ appears as a node. This
is exactly the level where $b$ stops being a leader, and is just an ordinary element of a set
where $a$ is a leader.

Now we are ready to define the pseudospanner. Let $H = (S, E)$, where $E$ contains
all edges $uv$, $u \ne v$, such that:

1. $v$ is the father of $u$, or

2. $u$ and $v$ know each other.

We cannot assign to these edges their real weights, because we do not know them.
Instead, we define $w_H(u, v)$ to be an upper bound on $d(u, v)$, which is also a good
approximation of $d(u, v)$. In particular:

1. If $u$ is a son of $v$ and $v$ beats $u$ at level $j$, we put $w_H(u, v) = 2\,\frac{\tau 2^{-\eta}}{\tau - 1}\, r_j$.

2. If $u$ and $v$ first meet each other at level $j$, we put $w_H(u, v) = \bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\bigr) r_j$.

We claim that $H$ is a $C(\eta, \tau)$-spanner for $V$ of size $O(n)$.

It easily follows from Lemmas 25 and 31 that $d(u, v) \le w_H(u, v)$, hence also for
any $u, v \in V$ we have $d(u, v) \le d_H(u, v)$, where $d_H$ is the shortest distance metric in
$H$.

Now, we only need to prove that for every pair of vertices $v, v^* \in X$, we have
$d_H(v, v^*) \le C(\eta, \tau)\, d(v, v^*)$. The proof is similar to that of Lemma 31. As before, let
$v \in S_1 \in \mathbb{S}_j$, $v \in S_2 \in \mathbb{S}_{j+1}$ and $v^* \in S_1^* \in \mathbb{S}_j$, $v^* \in S_2^* \in \mathbb{S}_{j+1}$ and assume $S_2$
knows $S_2^*$ but $S_1$ does not know $S_1^*$ (all that is assumed to hold in $T$, not in $T(S)$).
Then, since $S_1$ and $S_1^*$ do not know each other, $v$ and $v^*$ are at distance at least $r_j$. On
the other hand, since $S_2$ knows $S_2^*$ in $T$, we also have that $S_2 \cap S$ knows $S_2^* \cap S$ in
$T(S)$. Let $u =$ leader$(S_2 \cap S)$, $u^* =$ leader$(S_2^* \cap S)$. It follows from the definition
of $H$ that $uu^*$ is an edge in $H$ and it has weight $w_H(u, u^*) \le \bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\bigr)\tau r_j$.

Now consider the path from $v$ to $S_2 \cap S$ in $T(S)$. Initially, $v$ is the leader of the
singleton set $\{v\}$, then it might get beaten by some other vertex $v_1$, then $v_1$ can get
beaten by some other vertex $v_2$, and so on. Finally, at some level $u$ emerges as a leader.
This gives a path $v = v_0, v_1, \ldots, v_m = u$ in $H$. We have

$$w_H(v_i, v_{i+1}) = 2\,\frac{\tau 2^{-\eta}}{\tau - 1}\, r_{l_{i+1}},$$

where $l_{i+1}$ is the level at which $v_{i+1}$ beats $v_i$. Since all these levels are different and all
of them are at most $j + 1$, we get:

$$d_H(v, u) \le \sum_{i=0}^{m-1} w_H(v_i, v_{i+1}) \le 2\,\frac{\tau 2^{-\eta}}{\tau - 1}\, r_0 \sum_{l=0}^{j+1} \tau^l \le 2\,\frac{\tau 2^{-\eta}}{\tau - 1} \cdot \frac{\tau^{j+2} - 1}{\tau - 1}\, r_0 < \frac{2\tau}{\tau - 1} \cdot \frac{\tau 2^{-\eta}}{\tau - 1}\, \tau r_j.$$

We can argue in the same way for $v^*$ and $u^*$. Joining all 3 bounds we get:

$$d_H(v, v^*) \le d_H(v, u) + w_H(u, u^*) + d_H(u^*, v^*) \le \Bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\Bigr)\tau r_j + 2 \cdot \frac{2\tau}{\tau - 1} \cdot \frac{\tau 2^{-\eta}}{\tau - 1}\, \tau r_j,$$

and finally

$$d_H(v, v^*) \le \Bigl(1 + \Bigl(\frac{\tau}{\tau - 1}\Bigr)^2 2^{3-\eta}\Bigr)\tau r_j \le C(\eta, \tau)\, d(v, v^*).$$



Since every edge of the spanner either corresponds to a father-son edge in $T(S)$ or to
a meeting of two nodes in $T(S)$, it follows from Lemma 4 that $H$ has size $O(n)$. The
time complexity of constructing $H$ is essentially the same as that of constructing $T(S)$,
i.e. $O(k(\log k + \log\log n))$. □

E Facility location with unrestricted facilities 

In this section we study the variant of Facility Location with unrestricted facilities
(see Introduction). We show that our data structure can be augmented to process such
queries in $O(k(\log k + \log\log n))$ time, with the approximation guarantee of $3.04 + \varepsilon$.

Our approach here is a reduction to the problem solved in Corollary 17. The general
idea is roughly the following: during the preprocessing phase, for every point $x \in V$
we compute a small set $F(x)$ of facilities that seem a good choice for $x$, and when
processing a query for a set of cities $C$, we just apply Corollary 17 to cities' set $C$ and
facilities' set $\bigcup_{c \in C} F(c)$. In what follows we describe the preprocessing and the query
algorithm in more detail, and we analyze the resulting approximation guarantee.

In this section we consider a slightly different representation of the tree $T$. Namely,
we replace each edge $(v, \mathrm{parent}(v))$ of the original $T$ with a path containing a node
for each meeting of $v$. The nodes on the path are sorted by level, and for any such
node $v$, level$(v)$ denotes the level of the corresponding meeting. The new tree will
be denoted T. 



E.l Preprocessing 



Let us denote $\mathrm{vis}(j) = \bigl(1 + \frac{4\tau 2^{-\eta}}{\tau - 1}\bigr)\tau r_j$, i.e. vis$(j)$ is the upper bound from Lemma 31.
Note that vis$(j)$ is an upper bound on the distance between two points $v$ and $w$ such
that $v \in S_1$ and $w \in S_2$ for two sets $S_1, S_2$ that know each other and belong to the same
partition $\mathbb{S}_j$. For a node $v$ of the tree $T$ we will also denote vis$(v) =$ vis$(\mathrm{level}(v))$.

In the preprocessing, we begin with computing the compressed tree $T$. Next, for
each node $v$ of $T$ we compute a point in the sets which $v$ knows, with the smallest
opening cost among these points. Let us denote this point by cheap$(v)$. Finally, for
each $x \in V$ consider the path $P$ in $T$ from the leaf corresponding to $\{x\}$ to the root. Let
$P = (v_1, v_2, \ldots, v_{|P|})$ and for $i = 1, \ldots, |P|$ let $x_i =$ cheap$(v_i)$. Let $p$ be the smallest
number such that $f(x_p) \le n/\varepsilon_0 \cdot \mathrm{vis}(v_p)$, where $\varepsilon_0$ is a small constant, which we
determine later; now we just assume that $\varepsilon_0 \in (0, 1]$. Let $q$ be the smallest number such
that $q \ge p$ and $f(x_q) \le \varepsilon_0 \cdot \mathrm{vis}(v_q)$. If $p$ exists, we let $F(x) = \{v_p, v_{p+1}, \ldots, v_q\}$ and
otherwise $F(x) = \emptyset$.
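The choice of the indices $p$ and $q$ can be sketched in Python as follows. The sketch assumes the costs $f(x_i)$ and the values vis$(v_i)$ along the leaf-to-root path are already available as two parallel lists (0-based here); the function name `facility_window` and the fallback of extending the window to the end of the path when no valid $q$ is found are assumptions of this example, not part of the paper's definition.

```python
from typing import List, Optional, Tuple


def facility_window(
    costs: List[float],  # costs[i] = f(cheap(v_i)) along the leaf-to-root path
    vis: List[float],    # vis[i]   = vis(v_i), increasing with the level
    n: int,
    eps0: float,
) -> Optional[Tuple[int, int]]:
    """Return 0-based indices (p, q) delimiting F(x), or None when p does not exist."""
    p = next((i for i in range(len(costs)) if costs[i] <= n / eps0 * vis[i]), None)
    if p is None:
        return None  # F(x) is empty
    q = next((i for i in range(p, len(costs)) if costs[i] <= eps0 * vis[i]), None)
    if q is None:
        q = len(costs) - 1  # assumption: fall back to the end of the path
    return p, q
```

With `n = 4` and `eps0 = 0.5`, a path with vis values `[1, 2, 4, 8, 16, 32, 64]` and costs `[100, 100, 20, 20, 20, 10, 10]` gives the window `(2, 5)`: position 2 is the first one cheap enough relative to $n/\varepsilon_0 \cdot$ vis, and position 5 is the first one at or after it that is cheap relative to $\varepsilon_0 \cdot$ vis.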

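Lemma 39. For any $x \in V$, $|F(x)| = O(\log n)$.

Proof. Let $r = p + \lceil \log_\tau(n/\varepsilon_0^2) \rceil$. Note that for any $i = p, \ldots, r - 1$, level$(v_i) <$
level$(v_{i+1})$. Hence

$$\mathrm{vis}(\mathrm{level}(v_p)) = \mathrm{vis}(0)\,\tau^{\mathrm{level}(v_p)} \le \mathrm{vis}(0)\,\tau^{\mathrm{level}(v_r) - (r - p)} \le \frac{\varepsilon_0^2}{n}\,\mathrm{vis}(0)\,\tau^{\mathrm{level}(v_r)} = \frac{\varepsilon_0^2}{n} \cdot \mathrm{vis}(\mathrm{level}(v_r)).$$

It follows that $f(x_p) \le \varepsilon_0 \cdot \mathrm{vis}(\mathrm{level}(v_r))$. Then $q \le r$, since $x_p \in$ set$(v_r)$. □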

It is straightforward to see that all the sets $F(x)$ can be found in $O(n \log n)$ time.

The intuition behind our choice of $F(x)$ is the following. If $f(x_i) > n/\varepsilon_0 \cdot \mathrm{vis}(v_i)$,
then the opening cost of $x_i$ is too high, because even if $n$ cities contribute to the opening
of $x_i$, each of them has to pay more than vis$(v_i)$ on average (the constant $\varepsilon_0$ here is
needed to deal with some degenerate case, see further), i.e. more than an approximation
of its connection cost. Hence it is reasonable for cities in set$(v_i)$ to look a bit further
for a cheaper facility. On the other hand, when $f(x_i) \le \varepsilon_0 \cdot \mathrm{vis}(v_i)$, then even if city $x$
opens facility $x_i$ alone it pays much less than its connection cost to $x_i$. Since the possible
cheaper facilities are further than $x_i$, choosing $x_i$ would be a $(1 + \varepsilon_0)$-approximation.

E.2 Query 

Let $C \subseteq V$ be a set of cities passed as the query argument. Denote $k = |C|$. Now for each
$c \in C$ we choose the set of facilities

$$F_k(c) = \{\mathrm{cheap}(v) : v \in F(c) \text{ and } f(\mathrm{cheap}(v)) \le k/\varepsilon_0 \cdot \mathrm{vis}(v)\}.$$

Similarly as in Lemma 39 we can show that $|F_k(c)| = O(\log k)$. Clearly, $F_k(c)$ can
be extracted from $F(c)$ in $O(\log k)$ time: if $F(c)$ is sorted w.r.t. the level, we just
check whether $f(\mathrm{cheap}(v)) \le k/\varepsilon_0 \cdot \mathrm{vis}(v)$ beginning from the highest level
vertex and stop when this condition does not hold. Finally, we compute the union $F(C) =
\bigcup_{c \in C} F_k(c) \cup \{\mathrm{cheap}(\mathrm{root}(T))\}$ and we apply Corollary 17 to cities' set $C$ and
facilities' set $F(C)$. Note that $F(C)$ contains cheap$(\mathrm{root}(T))$, i.e. the point of $V$ with the
smallest opening cost; this is needed to handle some degenerate case.
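A minimal Python sketch of this query step is given below. It assumes that the preprocessing has stored, for each city, its set $F(c)$ as a list of `(cheap_point, opening_cost, vis)` triples sorted by increasing level; this layout and the names `filter_window` and `facility_union` are hypothetical and serve only the illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Entry = Tuple[Hashable, float, float]  # (cheap(v), f(cheap(v)), vis(v))


def filter_window(window: List[Entry], k: int, eps0: float) -> List[Hashable]:
    """Extract F_k(c): scan from the highest level and stop at the first failure."""
    chosen: List[Hashable] = []
    for point, cost, vis in reversed(window):
        if cost > k / eps0 * vis:
            break
        chosen.append(point)
    return chosen


def facility_union(
    windows: Dict[Hashable, List[Entry]],
    cities: Iterable[Hashable],
    eps0: float,
    root_cheapest: Hashable,
) -> Set[Hashable]:
    """Compute F(C): the union of all F_k(c) plus the globally cheapest facility."""
    city_list = list(cities)
    k = len(city_list)
    result: Set[Hashable] = {root_cheapest}  # covers the degenerate case
    for c in city_list:
        result.update(filter_window(windows[c], k, eps0))
    return result
```

The resulting set would then be passed, together with the cities, to whatever solver realizes Corollary 17; that call is outside the scope of this sketch.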



E.3 Analysis 



Theorem 40. Let SOL be a solution of the facility location problem for the cities' set
$C$ and facilities' set $V$. Then, for any $\varepsilon > 0$, there are values of parameters $\tau$, $\eta$ and $\varepsilon_0$
such that there is a solution SOL' of cost at most $(2 + \varepsilon)\,\mathrm{cost}(\mathrm{SOL})$, which uses only
facilities from the set $F(C)$.

Proof. We construct SOL' from SOL as follows. For each opened facility $x$ of SOL
such that $x \notin F(C)$, we consider the set $C(x)$ of all the cities connected to $x$ in SOL.
We choose a facility $x' \in F(C)$ and reconnect all the cities from $C(x)$ to $x'$.

Let $c^*$ be the city of $C(x)$ which is closest to $x$. Consider the path $P$ of $T$ from
the leaf corresponding to $c^*$ to the root. Let $v$ be the first node on this path such that $v$
knows $x$ and $\mathrm{vis}(v) \ge \varepsilon_0 f(x)/|C(x)|$.

Note that by the first inequality of Lemma [3T1 for the first node w on P that knows x, 
vis(w) <[l+ ~ _ 1 )rr leml{w) < \1 + —j jrd(x,c*). 



On the other hand, again by Lemma [3T1, for the first node $u$ on $P$ such that
$\mathrm{vis}(u) \ge \varepsilon_0 f(x)/|C(x)|$, there is $\mathrm{vis}(u) \le \tau\,\varepsilon_0 f(x)/|C(x)|$. Hence, since $v$ is the higher of $w$ and $u$,



vi8(«) < rmax|^||, (l + ^^(z.c*)} . (2) 

First we consider the non-degenerate case when $F_k(c^*) \ne \emptyset$. Let $v_p, \ldots, v_q$ be the subpath
of $P$ which was chosen during the preprocessing. Let $p' \in \{p, \ldots, q\}$ be the smallest
number such that $f(\mathrm{cheap}(v_{p'})) < k/\varepsilon_0 \cdot \mathrm{vis}(v_{p'})$. Recall that $F_k(c^*) = \{\mathrm{cheap}(v_i) : p' \le i \le q\}$.
If $v \in \{v_{p'}, \ldots, v_q\}$, then $F_k(c^*)$ contains a facility of opening cost at
most $f(x)$, at distance at most $\mathrm{vis}(v)$. Otherwise $v$ is higher than $v_q$ on $P$, so $F_k(c^*)$
contains a facility of cost at most $\varepsilon_0 \cdot \mathrm{vis}(v)$, at distance at most $\mathrm{vis}(v)$. To sum up,
$F_k(c^*)$ contains a facility of cost at most $\max\{f(x), \varepsilon_0 \cdot \mathrm{vis}(v)\}$, at distance at most
$\mathrm{vis}(v)$. Denote it by $x'$. We reconnect all of $C(x)$ to $x'$.



Now let us bound the cost of connecting $C(x)$ to $x'$. From the triangle inequality,
(2), and the fact that $c^*$ is closest to $x$ we get

Y d ( c ' x ')< Y d(c,x) + \C{x)\vis(v) 

cec(x) cec(x) 



< Y d(c,x) + rmax <| e f(x), (l + ^— Y d {c,x) 
<re f(x)+(l + T (l + ^^-) S \ Y d ^ x )- 0) 



Now let us expand the bound for f(x'): 

f(x') < max I f(x), £ t max | j^^j i (l + ~~~y) c *) 

< (i + eoT )/(a;)+eor(l + ^-^) • £ d(c,s). (4) 



c6C(i) 



From (3) and (4) together we get

f(x')+ Y d (^ x ')< (5) 



cGC{x) 

<(l + 2 £o r)/(x)+('l + (T + r £ o)(l + ^^)) ^ d(c,a;). 

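Finally, we handle the degenerate case when $F_k(c^*) = \emptyset$. Then we just connect all of
$C(x)$ to the facility $x' = \mathrm{cheap}(\mathrm{root}(T))$, i.e. the facility with the smallest opening
cost in $V$. Note that $F_k(c^*) = \emptyset$ implies that for any point $y \in V$ (and hence also for $x$),

$$f(y) \ge k/\varepsilon_0 \cdot \mathrm{vis}(\mathrm{root}(T)) \ge k/\varepsilon_0 \cdot \max_{x,y \in V} d(x,y).$$

Hence, $\sum_{c \in C(x)} d(c,x') \le |C(x)| \max_{x,y \in V} d(x,y) \le (|C(x)|/k)\,\varepsilon_0 f(x) \le \varepsilon_0 f(x)$.
Since $x'$ has the smallest opening cost in $V$, we have $f(x') \le f(x)$, and it follows that

$$f(x') + \sum_{c \in C(x)} d(c,x') \le (1 + \varepsilon_0) f(x). \qquad (6)$$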

From (5) and (6), we get that

cost(SOL') < f 1 + (r + rea) (l + - — -J J cost(SOL). 

One can easily see that the constants $\varepsilon_0$, $\tau$ and $\eta$ can be adjusted so that the coefficient
before cost(SOL) is arbitrarily close to 2. This proves our claim. □

From Theorem 40 and Corollary 17 we immediately get the following.



Corollary 41 (Facility Location with unrestricted facilities, Theorem 18 restated).
Assume that each point $x$ of the $n$-point set $V$ is assigned an opening cost $f(x)$. Given
$T$ and a set of $k$ points $C \subseteq V$, for any $\varepsilon > 0$, a $(3.04 + \varepsilon)$-approximate solution to
the facility location problem with cities' set $C$ and facilities' set $V$ can be constructed
in time $O(k \log k\,(\log^{O(1)} k + \log\log n))$, w.h.p.

F Dynamic Minimum Spanning Tree and Steiner Tree — 
algorithm details 

In this section we give details on the proof of Theorem 20, that is, we describe an
algorithm for MST in the static setting and then we make it dynamic.

We assume we have constructed the compressed tree $T(V)$. Apart from Lemma 44,
we treat $\lambda$, $\eta$ and $\tau$ as constants and omit them in the big-O notation. Recall that we are
given a subset $X \subseteq V$, $|V| = n$, $|X| = k$. In the static setting, we are to give a constant
approximation of an MST for the set $X$ in time almost linear in $k$. In the dynamic setting,
the allowed operations are additions and removals of vertices to/from $X$ and the goal is
to achieve polylogarithmic time per update.

F.1 Static Minimum Spanning Tree

We first show how the compressed tree $T(V)$ can be used to solve the static version
of the Minimum Spanning Tree problem, i.e. we are given a set $X \subseteq V$ and we want
to find a constant approximation of the Minimum Spanning Tree of $X$ in time almost
linear in $k$.

Let r (called the root) be any fixed vertex in X and let Li(X) = {x G X : 
meet(x, r) = i + 1}. Also, let D(t) = fl + iT T 2 _i ) t. As a consequence of Prop- 
erty 2] of the partition tree, we get the following: 

Lemma 42. Let $x, y \in L_i(X)$. Then $r_i \le d(x, r) < D(\tau) r_i$. Moreover, at level
$i + 1 + \log_\tau(2D(\tau)) = i + O(1)$ of $T$ the sets containing $x$ and $y$ are equal or know each
other.

A spanning tree $T$ of $X$ is said to be layered if it is a union of spanning trees for each
of the sets $\{r\} \cup L_i(X)$. The following is very similar to Lemma 8 in Jia et al. [16].

Lemma 43. There exists a layered tree $T_L$ of $X$ with weight at most $O(1) \cdot \mathrm{OPT}$, where
OPT is the weight of the MST of $X$.

Proof. Let $T_{OPT}$ be any minimum spanning tree of $X$ with cost OPT. Let $m = \lceil \log_\tau(1 + D(\tau)) \rceil$
and $C^j = \{r\} \cup \bigcup\{L_i(X) : i \bmod m = j\}$ for $0 \le j < m$. Double the edges
of $T_{OPT}$, walk the resulting graph along its Euler tour and shortcut all vertices not
belonging to $C^j$. In this way we obtain the tree $T^j_{OPT}$, a spanning tree of $C^j$ with cost at
most $2\,\mathrm{OPT}$. Clearly $\bigcup_{j=0}^{m-1} T^j_{OPT}$ is a spanning tree for $X$ with cost at most $2m\,\mathrm{OPT}$.

Let $xy$ be an edge of $T^j_{OPT}$ such that $x$ and $y$ belong to different layers, that is,
$x \in L_i(X)$, $y \in L_{i'}(X)$, $i < i'$. Then $i' \ge i + m$ and due to Lemma [3T1

$$d(y, r) \ge r_{i'} \ge (1 + D(\tau)) r_i > r_i + d(x, r).$$

Therefore $d(x, y) > r_i$ and, as $d(x, r) < D(\tau) r_i$, $d(x, r) < D(\tau) d(x, y)$. Moreover:

$$d(y, r) \le d(x, y) + d(x, r) < d(x, y) + D(\tau) r_i < (1 + D(\tau)) d(x, y).$$

Therefore by replacing $xy$ by one of the edges $xr$ or $yr$, we increase the cost of this edge
at most $(1 + D(\tau))$ times. If we replace all such edges $xy$ in all $T^j_{OPT}$ for $0 \le j < m$,
we obtain a layered spanning tree of $X$ with cost at most $2m(1 + D(\tau))\,\mathrm{OPT}$. □
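The doubling-and-shortcutting step used in the proof can be illustrated as follows. This is a minimal sketch under our own assumptions: the tree is given as an adjacency dictionary, `dist` is any metric, and the function name and signature are hypothetical. Visiting the kept vertices in depth-first preorder corresponds to the order of their first appearance on an Euler tour of the doubled tree, so by the triangle inequality the resulting path costs at most twice the tree.

```python
def shortcut_tour(tree_adj, dist, keep, start):
    """Return the edges (a, b, weight) of a path through the vertices of
    `keep`, in the order they are first met on an Euler tour of the tree."""
    order, seen, stack = [], {start}, [start]
    while stack:
        u = stack.pop()
        if u in keep:            # vertices outside `keep` are shortcut
            order.append(u)
        for w in reversed(tree_adj[u]):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return [(a, b, dist(a, b)) for a, b in zip(order, order[1:])]


# Small example: four points on a line, tree edges r-a, r-b, a-c.
pos = {"r": 0, "a": 1, "b": -2, "c": 3}
tree = {"r": ["a", "b"], "a": ["r", "c"], "b": ["r"], "c": ["a"]}
path = shortcut_tour(tree, lambda u, v: abs(pos[u] - pos[v]), {"r", "b", "c"}, "r")
```

In the example the tree has weight 5, and the shortcut path through `r`, `c`, `b` has weight 8, within the factor of 2.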

Our strategy is to construct an $O(1)$-approximation $T_i(X)$ to the MST for each layer
separately and connect all these trees to $r$, which gives an $O(1)$-approximation to the MST
for $X$ by the above lemma. The spanning tree $T_i(X)$ is constructed using a sparse
spanning graph $G_i(X)$ of $L_i(X)$. In order to build $G_i(X)$ we investigate the levels of $T$
at which the sets containing vertices of $L_i(X)$ meet each other. The following technical
lemma shows that we can extract this information efficiently:

Lemma 44. Let $x \in X$ and let $i < j$ be levels of $T$. Then, for each level of $T$ in the range
between $i$ and $j$ we can find all sets known to $x$, in total time $O(\log\log\log \mathrm{stretch} + \log\log n +
\lambda^{\eta+3}(j - i))$. If we are given a level $i_0$ such that there exists a meeting (possibly not
involving $x$) at level $i_0$, the above query takes $O((\log\eta + \log\log\lambda)\log\log n + |i_0 - i| +
\lambda^{\eta+3}(j - i))$ time.

To perform these queries, the tree needs to be equipped with additional information
of size $O(\lambda^{2\eta+6} n)$.

Proof. First we perform jump($x$, $i$) and, starting with the returned meeting, we go up
the tree, one meeting at a time, until we reach level $j$. In this way, we iterate over all
meetings of $x$ between levels $i$ and $j$. We store in the tree, for each meeting $(S, S', i')$,
the current set of acquaintances of $S$ and $S'$. The answer to our query can be retrieved
directly from this information. Also note that this extra information takes $O(\lambda^{2\eta+6} n)$
space, as there are at most $(2n - 1)\lambda^{\eta+3}$ meetings and each set knows at most $\lambda^{\eta+3}$
other sets at any fixed level. If we are given level $i_0$, we may simply perform jump($x$, $i_0$)
and walk the tree to level $i$. □

Theorem 45. Given the compressed tree $T(V)$ and a subset $X \subseteq V$ one can construct
an $O(1)$-approximation $T$ to the MST in time $O(k(\log\log n + \log k))$.

Proof. Designate any $r \in X$ as the root of the tree. Split all the remaining vertices into
layers $L_i(X)$. For each nonempty layer pick a single edge connecting a vertex in this
layer to $r$ and add it to $T$. Furthermore, add to $T$ an approximate MST for each layer,
denoted $T_i(X)$, constructed as follows.

Consider a layer $L_i(X)$ with $k_i > 0$ elements. We construct a sparse auxiliary
graph $G_i(X)$ with $L_i(X)$ as its vertex set. We use Lemma 44 to find for each vertex
$x \in L_i(X)$ and every level $\ell \in [i - \log_\tau k + 1,\ i + 1 + \log_\tau(2D(\tau))]$ all the sets
known to $x$ at level $\ell$ in $T$. Using level $i + 1 = \mathrm{meet}(x, r)$ as the anchor $i_0$, this can be
done in $O(\log\log n + \log k)$ time per element $x \in L_i(X)$. Using this information we
find, for every $\ell$ as above and every set $S$ at level $\ell$ known to at least one $x \in L_i(X)$, a
bucket $B_{\ell,S}$. This bucket contains all $x \in L_i(X)$ that know $S$ at level $\ell$. Note that the
total size of the buckets is $O(k_i \log k)$, because we are scanning $O(\log k)$ levels, and a
vertex can only know $O(1)$ sets at each level. Now we are ready to define the edge set
$E(G_i(X))$. For every bucket $B_{\ell,S}$, we add to $E(G_i(X))$ an arbitrary path through all
elements of $B_{\ell,S}$. We also assign to each edge on this path a weight of $2D(\tau) r_{\ell-1}$.

Since the total size of the buckets is $O(k_i \log k)$, the graph $G_i(X)$ also has
$O(k_i \log k)$ edges. We let $T_i(X)$ be the MST of $G_i(X)$. Note that $T_i(X)$ can be found in
time $O(k_i \log k)$ by the following adjustment of Kruskal's algorithm. If we consider
buckets in increasing order of $\ell$, the edges on the paths are given in
increasing order of their lengths. At every step of Kruskal's algorithm, we keep the current
set of connected components of $T_i(X)$ as an array, where every $x \in L_i(X)$ knows
the ID of its component. We also keep the size of each component. Whenever two
components are joined by a new edge, the new component inherits the ID of the bigger
subcomponent. This ensures that for every $x \in L_i(X)$ we change its ID at most $\lfloor \log k_i \rfloor$
times.
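The component-relabeling variant of Kruskal's algorithm described above can be sketched as follows. This is an illustrative sketch rather than the authors' code; the function name and the edge representation `(weight, u, v)` are our assumptions, and the edges are assumed to arrive already sorted by weight, as they do when the buckets are scanned level by level.

```python
def kruskal_by_relabeling(vertices, sorted_edges):
    """Kruskal's algorithm with an explicit component ID per vertex.
    On each merge the smaller component is relabeled, so every vertex
    changes its ID at most floor(log2(number of vertices)) times."""
    comp = {v: v for v in vertices}        # ID of the component containing v
    members = {v: [v] for v in vertices}   # vertices of each component
    tree = []
    for w, u, v in sorted_edges:
        a, b = comp[u], comp[v]
        if a == b:
            continue                       # edge would close a cycle
        if len(members[a]) < len(members[b]):
            a, b = b, a                    # a is now the bigger component
        for x in members[b]:               # relabel the smaller component
            comp[x] = a
        members[a].extend(members.pop(b))
        tree.append((w, u, v))
    return tree
```

Because no sorting and no union-find structure is needed, the running time is linear in the number of edges plus the total number of relabelings.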

We now claim that

Lemma 46. The total weight of $T_i(X)$ is $O(1)(\mathrm{OPT}_{L_i(X)} + r_i)$, where $\mathrm{OPT}_{L_i(X)}$ is
the weight of the MST for $L_i(X)$.

Proof (of Lemma 46). First, note that Kruskal's algorithm connects the whole
bucket $B_{\ell,S}$ when considering edges of length $2D(\tau) r_{\ell-1}$. Therefore we may modify
the graph $G_i(X)$ to $G'_i(X)$ such that all pairs of vertices in $B_{\ell,S}$ are connected by edges
of length $2D(\tau) r_{\ell-1}$, without changing the weight of the MST. Let $d_i$ be the metric
in $G'_i(X)$. By Lemma [3T1, $d_i(x, y) \ge d(x, y)$, as the sets containing $x$ and $y$ at a level
where $x$ and $y$ are not placed in the same bucket do not know each other. If $d(x, y) <
r_{i - \log_\tau k}$, then $d_i(x, y) = 2D(\tau) r_{i - \log_\tau k} = 2D(\tau) r_i / k$, as both $x$ and $y$ meet in some
bucket at level $i - \log_\tau k + 1$. Otherwise, if $d(x, y) \ge r_{i - \log_\tau k}$, by Lemma [3T1 again,
$d_i(x, y) \le 2D(\tau) d(x, y)$, as both $x$ and $y$ know $S$ at level $\ell$.

Let us replace the metric $d$ by $d'$ defined as follows: for $x \ne y$, $d'(x, y) = 2D(\tau) d(x, y)$
if $d(x, y) \ge r_{i - \log_\tau k}$ and $d'(x, y) = 2D(\tau) r_{i - \log_\tau k}$ otherwise. Clearly $d(x, y) \le d_i(x, y) \le
d'(x, y)$. Note that $d'$ satisfies the following condition: for each $x, y, x', y' \in L_i(X)$,
$d(x, y) < d(x', y')$ iff $d'(x, y) < d'(x', y')$. Therefore, Kruskal's algorithm for the
MST in $(L_i(X), d)$ chooses the same tree $T_{OPT}$ as when run on $(L_i(X), d')$. Let $d(T_{OPT})$
($d_i(T_{OPT})$, $d'(T_{OPT})$) denote the weight of $T_{OPT}$ with respect to the metric $d$ (resp. $d_i$ and
$d'$). Let us now bound $d'(T_{OPT})$. $T_{OPT}$ consists of $k_i - 1$ edges, so the total cost of
edges $xy$ such that $d(x, y) < r_{i - \log_\tau k}$ is at most $(k_i - 1) \cdot 2D(\tau) r_i / k \le 2D(\tau) r_i$. Other
edges cost at most $2D(\tau)$ times more than when using the metric $d$, so in total $d'(T_{OPT}) \le
2D(\tau)(r_i + d(T_{OPT}))$. As $d_i(T_{OPT}) \le d'(T_{OPT})$, and Kruskal's algorithm finds a
minimum spanning tree with respect to $d_i$, the lemma is proven. □

Now we are ready to prove that $T$ is an $O(1)$-approximation of the MST for $X$. Let
OPT be the weight of the MST for $X$, and $\mathrm{OPT}_L$ be the weight of the optimal layered MST
of $X$. We know that $\mathrm{OPT}_L \le O(1) \cdot \mathrm{OPT}$ by Lemma 43.

The optimal solution has to connect $r$ with the vertex $x \in X$ which is the furthest
from $r$. We have $d(r, x) \ge r_{\max}$, where $r_{\max} = r_i$ for the biggest $i$ with non-empty
$L_i(X)$. It follows that $\mathrm{OPT} \ge r_{\max} \ge \Omega(1) \sum_i r_i$, because the $r_i$-s form a geometric
sequence. Thus, the cost of connecting all layers to $r$ is bounded by $O(1) \cdot \mathrm{OPT}$.
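The geometric-sequence step can be spelled out as follows. We assume here (our reading of the notation $\log_\tau$ and of the identity $r_{i - \log_\tau k} = r_i / k$ used elsewhere in this section) that $r_i = \tau^i r_0$ for a constant $\tau > 1$, and we write $i_{\max}$ for the biggest $i$ with non-empty $L_i(X)$, a symbol introduced only for this illustration. Summing over the non-empty layers:

```latex
\sum_{i \le i_{\max}} r_i
  \;\le\; r_{\max} \sum_{j \ge 0} \tau^{-j}
  \;=\; \frac{\tau}{\tau - 1}\, r_{\max}
  \;\le\; \frac{\tau}{\tau - 1}\,\mathrm{OPT}.
```

Since each layer is attached to $r$ by an edge of length less than $D(\tau) r_i$, the total cost of these edges is at most $D(\tau)\frac{\tau}{\tau-1}\,\mathrm{OPT}$ under the same assumption.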



Moreover, Lemma 46 implies that the sum of the weights of all $T_i(X)$-s is bounded by:

$$O(1) \sum_i \left(r_i + \mathrm{OPT}_{L_i(X)}\right) \le O(1)(\mathrm{OPT} + \mathrm{OPT}_L) \le O(1) \cdot \mathrm{OPT}.$$

Thus the constructed tree $T$ is an $O(1)$-approximation of the MST of $X$. □

F.2 Dynamic Minimum Spanning Tree 

The dynamic approximation algorithm for MST builds on the ideas from the previous 
subsection. However, we tackle the following obstacles: 

- we do not have a fixed root vertex around which the layers could be constructed, 

- the number of distance levels considered when building auxiliary graphs is dependent
on $k$, and as such can change during the execution of the algorithm, and finally,

- we need to compute the minimum spanning trees in auxiliary graphs dynamically. 

The following theorem shows that all of these problems can be solved successfully. 

Theorem 47 (Theorem 20 restated). Given the compressed tree $T(V)$, we can maintain
an $O(1)$-approximate Minimum Spanning Tree for a subset $X$ subject to insertions
and deletions of vertices. The insert operation works in $O(\log^5 k + \log\log n)$ time and
the delete operation works in $O(\log^5 k)$ time, where $k = |X|$. Both times are expected and
amortized.

Proof. The basic idea is to maintain the layers and the auxiliary graphs described in the
proof of Theorem 45. However, since we are not guaranteed that any vertex is going to
permanently stay in $X$, we might need to occasionally recompute the layers and graphs
from scratch, namely when our current root is removed. However, if we always pick
the root randomly, the probability of this happening as a result of any given operation is
at most $1/k$, and so it will not affect our time bounds. It does, however, make them randomized.

The number of distance levels considered for each layer is $\log_\tau k + O(1)$ and so
it might change during the execution of the algorithm. This can be remedied in many
different ways, for example we might recompute all the data structures from scratch
every time $k$ changes by a given constant factor.
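The rebuild-on-resize remedy can be sketched as a small wrapper. Everything here is an assumption made for illustration: the class name, the `build` callback standing in for the construction of layers and buckets, and the choice of 2 as the constant factor. For brevity the sketch only tracks when a rebuild is triggered and does not update `state` incrementally between rebuilds.

```python
class RebuildOnResize:
    """Rebuild a size-dependent structure whenever the number of items
    drifts by more than a factor of 2 from its size at the last rebuild."""

    def __init__(self, items, build):
        self.build = build        # hypothetical callback: set of items -> structure
        self.items = set(items)
        self.rebuilds = 0
        self._rebuild()

    def _rebuild(self):
        self.k0 = max(1, len(self.items))   # size at the last rebuild
        self.state = self.build(self.items)
        self.rebuilds += 1

    def _maybe_rebuild(self):
        k = len(self.items)
        if k > 2 * self.k0 or 2 * k < self.k0:
            self._rebuild()

    def insert(self, x):
        self.items.add(x)
        self._maybe_rebuild()

    def delete(self, x):
        self.items.discard(x)
        self._maybe_rebuild()
```

Between two rebuilds the size changes by at least a constant fraction of `k0`, so the cost of a rebuild can be charged to the updates performed since the previous one, which is the usual amortization argument.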

The above remarks should make it clear that we can actually maintain the layer
structure and the auxiliary graph (as a collection of paths in non-empty buckets $B_{\ell,S}$)
for each layer with low cost (expected and amortized) per update. We now need to show
how to use these structures to dynamically maintain a spanning tree. We use the algorithm
of de Lichtenberg, Holm and Thorup (LHT) [13] that maintains a minimum spanning
tree in a graph subject to insertions and deletions of edges, both in time $O(\log^4 n)$,
where $n$ is the number of vertices.

We are going to use the LHT algorithm for each auxiliary graph separately. Note
that inserting or deleting a vertex corresponds to inserting or deleting $O(\log k)$ edges in
this graph, as every vertex is in $O(\log k)$ buckets.

In case of insertion of a vertex $x$, we need $O(\log\log n)$ time to perform meet($x$, root)
and find the appropriate layer. Non-empty layers and their non-empty buckets may be
stored in a dictionary, so the search for a fixed layer or a fixed bucket is performed
in $O(\log k)$ time. Having found the appropriate layer, we insert $x$ into the buckets of all sets
it knows, taking $O(\log^4 k)$ time to update edges in each bucket. Therefore the insert operation works in
expected and amortized time $O(\log^5 k + \log\log n)$.

In case of deletion of a vertex $x$, we may maintain for each vertex in $X$ a list of
its occurrences in buckets, and in this way we may quickly access its incident edges. For each
occurrence, we delete the two edges incident to $x$ and connect the neighbors of $x$. Therefore
the delete operation works in expected and amortized time $O(\log^5 k)$. □
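The per-bucket update described above can be sketched with a doubly linked list. This is a hypothetical illustration, not the authors' data structure: the class `BucketPath` and its methods are our own names, and the returned edge lists are meant to be forwarded to whatever dynamic MST structure is in use (LHT in the text).

```python
class BucketPath:
    """A path through the elements of one bucket, kept as a doubly linked list."""

    def __init__(self):
        self.prev, self.next, self.tail = {}, {}, None

    def insert(self, x):
        """Append x to the end of the path; return the list of edges to insert."""
        self.prev[x], self.next[x] = self.tail, None
        added = []
        if self.tail is not None:
            self.next[self.tail] = x
            added.append((self.tail, x))
        self.tail = x
        return added

    def delete(self, x):
        """Remove x; return (edges to delete, edges to insert)."""
        p, n = self.prev.pop(x), self.next.pop(x)
        removed = [(a, b) for a, b in ((p, x), (x, n))
                   if a is not None and b is not None]
        added = []
        if p is not None:
            self.next[p] = n
        if n is not None:
            self.prev[n] = p
        else:
            self.tail = p
        if p is not None and n is not None:
            added.append((p, n))   # reconnect the two neighbors of x
        return removed, added
```

Each operation touches a constant number of edges per bucket, which is why a vertex update translates into $O(\log k)$ edge updates overall.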



